Help with mcscf convergence

I have experience with scf convergence but not with mcscf convergence, and since I’m starting to learn how to do casscf calculations, I would like to ask for some tips and advice on getting the mscsf to converge.

Here is my input:

molecule {
0 1
LI1
LI2 LI1 1.25
}

set {
basis def2-tzvppd
scf_type df
guess sad
e_convergence 8
d_convergence 8
docc = [2, 0, 0, 0, 0, 1, 0, 0 ]
active = [5, 0, 0, 0, 0, 5, 3, 3 ]
mcscf_maxiter 150
}

nrg, wfn = energy(‘casscf’, return_wfn=True)

I’m using the latest official release, v. 1.2.1. I have fiddled with the following keywords:

mcscf_diis_max_vecs
mcscf_algorithm
mcscf_guess
mcscf_max_rot

but can’t get convergence. The output looks like this:

    Iter         Total Energy       Delta E   Orb RMS    CI RMS  NCI NORB

@MCSCF 1: -14.747202187152 -4.4845e-02 1.86e-03 8.25e-07 8 1 Initial CI
@MCSCF 2: -14.747822251858 -6.2006e-04 2.95e-03 1.76e-06 9 1 TS
@MCSCF 3: -14.750749659836 -2.9274e-03 4.57e-03 2.88e-06 9 1 TS
Warning! Maxstep = 0.53, scaling to 0.40
@MCSCF 4: -14.763163367127 -1.2414e-02 7.38e-03 0.00e+00 12 1 TS, DIIS
@MCSCF 5: -14.770851477915 -7.6881e-03 8.82e-03 0.00e+00 12 1 TS, DIIS
Warning! Maxstep = 0.41, scaling to 0.40
@MCSCF 6: -14.767722215456 3.1293e-03 1.18e-02 0.00e+00 12 1 TS, DIIS
Warning! Maxstep = 0.81, scaling to 0.40
@MCSCF 7: -14.772742137647 -5.0199e-03 4.61e-03 0.00e+00 12 1 TS, DIIS
@MCSCF 8: -14.775008962244 -2.2668e-03 2.50e-03 0.00e+00 12 1 TS, DIIS
@MCSCF 9: -14.776136982925 -1.1280e-03 4.43e-03 0.00e+00 12 1 TS, DIIS
@MCSCF 10: -14.777304284512 -1.1673e-03 3.27e-03 0.00e+00 12 1 TS, DIIS
[omitting intermediate iters for brevity]
@MCSCF 140: -14.779945230548 -1.3116e-06 1.41e-04 0.00e+00 12 1 TS, DIIS
@MCSCF 141: -14.779954395044 -9.1645e-06 1.42e-04 0.00e+00 12 1 TS, DIIS
@MCSCF 142: -14.779958330966 -3.9359e-06 1.35e-04 0.00e+00 12 1 TS, DIIS
@MCSCF 143: -14.779950735865 7.5951e-06 1.35e-04 0.00e+00 12 1 TS, DIIS
@MCSCF 144: -14.779956146509 -5.4106e-06 1.71e-04 0.00e+00 12 1 TS, DIIS
@MCSCF 145: -14.779952971461 3.1750e-06 1.82e-04 0.00e+00 12 1 TS, DIIS
@MCSCF 146: -14.779991472404 -3.8501e-05 1.64e-04 0.00e+00 12 1 TS, DIIS
@MCSCF 147: -14.779981025679 1.0447e-05 1.61e-04 0.00e+00 12 1 TS, DIIS
@MCSCF 148: -14.780009679953 -2.8654e-05 2.09e-04 0.00e+00 12 1 TS, DIIS
@MCSCF 149: -14.779991854334 1.7826e-05 2.14e-04 0.00e+00 12 1 TS, DIIS
@MCSCF 150: -14.780004212604 -1.2358e-05 2.08e-04 0.00e+00 12 1 TS, DIIS

Often after 150 iterations the orbitals RMS is only 1e-3, so I did a little better here using mcscf_max_rot=0.4, but I’m just stabbing in the dark at this point. When one has this kind of problem, what are the recommended things to try?

Always always always double-check your active space and your geometry. Always. I ran this computation myself, and it looks like the ordering of orbital energies in vanilla SCF is:

    Doubly Occupied:                                                      

       1Ag    -2.661020     1B1u   -2.548529     2Ag    -0.209733  

    Virtual:                                                              

       1B3u    0.005156     1B2u    0.005156     2B1u    0.011595  
       3Ag     0.049491     1B2g    0.059896     1B3g    0.059896  
       3B1u    0.063416     2B3u    0.078978     2B2u    0.078978  
       4Ag     0.123702     5Ag     0.160921     4B1u    0.180307  
       2B2g    0.182642     2B3g    0.182642     1B1g    0.326548  
       6Ag     0.326548     5B1u    0.348420     3B3u    0.402386  
       3B2u    0.402386     7Ag     0.434997     3B2g    0.441744  
       3B3g    0.441744     1Au     0.490063     6B1u    0.490063  
       7B1u    0.547246     4B3u    0.626574     4B2u    0.626574  
       8Ag     0.658857     4B2g    1.041967     4B3g    1.041967  
       8B1u    1.157239     9Ag     1.775201     9B1u    2.239306  
      10B1u    6.233364     5B3u    6.241537     5B2u    6.241537  
       5B2g    6.422021     5B3g    6.422021    10Ag     6.605620  
      11Ag    15.192822    11B1u   15.320472  

Psi itself reports your active space of choice as:

   ------------------------------------------------------------------------------
               Space    Total    Ag   B1g   B2g   B3g    Au   B1u   B2u   B3u
   ------------------------------------------------------------------------------
         Frozen DOCC        0     0     0     0     0     0     0     0     0
     Restricted DOCC        0     0     0     0     0     0     0     0     0
              Active       16     5     0     0     0     0     5     3     3
     Restricted UOCC       28     6     1     5     5     1     6     2     2
         Frozen UOCC        0     0     0     0     0     0     0     0     0
   ------------------------------------------------------------------------------

With no core orbitals, the key information is in the active section. Cross-referencing that against the Hartree-Fock orbitals, you’re skipping over two B2g/B3g pairs, a B1g/Ag pair, and another Ag to get a high-lying B1u, B3u, and B2u. I would need to work through the correlation tables to be sure, but I have a very strong suspicion that this does not make good chemical sense.

When I tried a CAS(6, 15) with the active space [5, 0, 1, 1, 0, 4, 2, 2], it still didn’t converge, but I got a lower energy (-14.78393), which tells me I’m on the right track. I tried removing the highest-energy orbital, the 4th B1u, and the computation still failed to converge, but with a much higher energy.

However, this is a strange geometry. A 1.25 separation for the lithium dimer is tiny. NIST tells me 2.7 angstroms is more realistic for the ground state. Did you mean to have the atoms that close together? This is causing problems even in a conventional SCF - there’s a UHF solution about 0.2 hartrees lower. It’s horrifically spin contaminated, of course, but that shouldn’t be there. My hunch is that it’s making your CASSCF harder than it should be.

Thank you for looking into this for me.

The geometry is correct, as this is part of a PES scan.

My choice of active space is actually based on the output of a Full-CI calculation, since the HF output is so unreliable (or so I’ve read, but it’s the easiest to obtain, so I appreciate why it is used).

Perhaps the problem is that I am misinterpreting the CI output, which is this:

The 20 most important determinants:

*   1    0.891692  ( 8043, 8043)  1AgX 2AgX 1B1uX 
*   2   -0.235707  ( 4172, 4172)  1AgX 1B1uX 1B3uX 
*   3   -0.235707  ( 5677, 5677)  1AgX 1B1uX 1B2uX 
*   4   -0.116541  ( 5677, 5799)  1AgX 1B1uX 1B2uA 2B2uB 
*   5   -0.116541  ( 5799, 5677)  1AgX 1B1uX 1B2uB 2B2uA 
*   6   -0.116541  ( 4172, 4319)  1AgX 1B1uX 1B3uA 2B3uB 
*   7   -0.116541  ( 4319, 4172)  1AgX 1B1uX 1B3uB 2B3uA 
*   8   -0.071485  (  411,  411)  1AgX 1B1uX 2B1uX 
*   9   -0.055456  ( 5799, 5799)  1AgX 1B1uX 2B2uX 
*  10   -0.055456  ( 4319, 4319)  1AgX 1B1uX 2B3uX 
*  11   -0.054345  (  411,  423)  1AgX 1B1uX 2B1uA 3B1uB 
*  12   -0.054345  (  423,  411)  1AgX 1B1uX 2B1uB 3B1uA 
*  13   -0.041745  (  423,  423)  1AgX 1B1uX 3B1uX 
*  14   -0.031678  ( 8043, 8044)  1AgX 2AgA 1B1uX 3AgB 
*  15   -0.031678  ( 8044, 8043)  1AgX 2AgB 1B1uX 3AgA 
*  16   -0.031254  (  411,  446)  1AgX 1B1uX 2B1uA 4B1uB 
*  17   -0.031254  (  446,  411)  1AgX 1B1uX 2B1uB 4B1uA 
*  18   -0.031249  ( 8043, 8049)  1AgX 2AgA 1B1uX 5AgB 
*  19   -0.031249  ( 8049, 8043)  1AgX 2AgB 1B1uX 5AgA 
*  20   -0.024409  (  423,  446)  1AgX 1B1uX 3B1uA 4B1uB 

No B[123]g or Au orbitals here, so I left them out of the active space. Is this wrong? If so, is there anything more reliable than the HF output? I should mention that I also tried using an MP2 wfn as the reference wfn for the CASSCF, and a CISD wfn, and even a CAS-CI wfn, but the CASSCF would not converge. I can’t tell whether it’s a problem with the active space or a problem with the reference wfn.

Thank for helping!

Yes, full CI is a much better way to determine what orbitals are likely to be significant than a Hartree-Fock computation. My usual approach is to do a full valence CAS and see what matters.

To put your numbers in perspective, if you compute full CI on water with 6-31G, your leading determinant has a coefficient of .978 and a weight (coefficient squared) of 0.956. The next most significant determinant has a coefficient of 0.053 and a weight of 0.003. Since water is most definitely a single reference problem, we can very safely say that you could have stopped including determinants long before you got to coefficients of 0.024409. Unless you want this MCSCF computation to specifically monitor the weight of determinants or have some very specific reason not to do so, cut down your active space to [2, 0, 0, 0, 0, 1, 1, 1] (weight >2%) or [2, 0, 0, 0, 0, 1, 2, 2] (weight > 1%). Having fewer determinants will make convergence much easier. If you want to do some correlated computation after this with Forte, definitely pare down the active space.

As for the missing B1g/B2g/B3g orbitals, you were right to exclude them. I was hasty and didn’t check the diagrams and symmetry arguments. Apologies if you have no clue what I’m talking about, but here’s “why” those orbitals don’t appear:
It looks like you want the ground state, which is singlet sigma plus gerade. If we assume the 1s bonding orbtial is occupied (1AgX) in all determinants, that doesn’t give us many ways to have any B_g orbitals occupied. We could have a A, a B1, a B2, and a B3, to maintain our sigma plus, but you can’t make gerade work. You could maybe have two electrons in the same B2 or B3, but then you either have both electrons in one spatial orbital in a spatially degenerate pair, or you fill both of those, which gives a negative bond order. Not favorable. You could send one electron to 1B2 and the other to 2B2, but excited antibonding orbitals? Also not likely. So on hindsight, it makes sense that B1g/B2g/B3g aren’t going to be important orbitals.

OK, thank you, I guess I just got too greedy in trying to use a big active space. It makes sense that reducing it will help convergence.

Looking at the CI coefficients for the determinants, it looks like at least the top 3 are important. Now my understanding is that CASSCF is not a multi-reference method, so if one does a CI calculation and finds more than 1 important determinant, then one should use a multi-reference method to model such a system. True?

Even if the answer to that question is not true, then I still want to ask what sort of multi-reference methods Psi4 can do. Apart from FCI, I see Mk-MRCC. Have I missed any others?

First, on the idea of choosing a big active space:
All of quantum chemistry is about convergence. We can’t practically choose an infinite basis set or all the excitations, so we have to instead find a way to get the relative energies “about right” with a finite basis and only some excitations. This means we have to balance our treatment of static and dynamic correlation. If we pick an active space that’s too big, we start treating dynamic correlation when we meant to just treat our static correlation. We’ll get some correlation “too soon” compared to in a balanced treatment, and our comparisons will be off. So if you want to do a more sophisticated multireference treatment after your CASSCF, there is such a thing as an active space that’s too large!

That brings me to the next point. CASSCF is a multireference method, but it’s the multireference version of Hartree-Fock. Just like Hartree-Fock is a good starting point for CI or coupled cluster methods but not something you’d trust quantitatively, CASSCF is a good starting point for MRCI, CASPT2, or one of the many multireference coupled clusters out there, but not something you should trust quantitatively.

Now for your system, you could use a multireference method, but you also have the option of just going out to high excitation levels. A “multireference problem” can still be captured by single reference methods, but it takes longer than normal to converge. I imagine that by the time you get to CCSDT(Q), you’ll pretty much have your answer. After all, you only have six electrons, and two of those don’t want to leave the core. You can check by looking at how the relative energy changes as you change the correlation level. I’ll warn you that Psi4 can’t do CCSDT(Q). You’d need MRCC or CFOUR for that. However, Psi4 interfaces with both of those, and they’re both free.

If you want to use a post-CASSCF multireference method in Psi4, you can either use Mk-MRCC or something in the Forte plugin. I’m not familiar with Forte, but the manual for that is here.

Good luck!

Uhm…

This input makes no sense to me. Lithium has 3 electrons, so Li2 has 6 electrons. However, you specify in the input that 3 orbitals are restricted to be doubly occupied. This should leave none for the CASSCF part.

I’m not 100% sure if you can treat the 1s in Li as core or not; if I were you I’d just include those in the active space and set

docc =   [0, 0, 0, 0, 0, 0, 0, 0 ]

Or is docc ignored in detci?

That struck me as strange at first too, but when I ran the input file, I found that DOCC was ignored during the CASSCF.