DETCI Segmentation fault for polycarbene computation

I am trying to run a CASSCF computation for this open-shell polycarbene species. It has 10 unpaired electrons.

I get a sementation fault during the setup of the CI strings.

I can run this calculation exactly as is in our CASSCF code, so I don’t think it is a problem with the input.

memory 10 gb
molecule carbene_5 {
0 1
C 0.0000000000 5.9214795174 9.3001889023
C 0.0000000000 7.7764762644 7.4269789748
C 0.0000000000 7.0785588292 4.8096893821
C 0.0000000000 4.4591187281 4.2215442482
C 0.0000000000 2.5538628248 6.1127538172
C 0.0000000000 3.3543262487 8.7005013213
H 0.0000000000 6.4895198549 11.2723392210
H 0.0000000000 9.7510057483 7.9618546187
H 0.0000000000 3.8920402612 2.2526801633
H 0.0000000000 1.9667494814 10.2035875985
C 0.0000000000 0.0000000000 5.3539456265
C 0.0000000000 -2.5538628248 6.1127538172
C 0.0000000000 -3.3543262487 8.7005013213
C 0.0000000000 -4.4591187281 4.2215442482
C 0.0000000000 -5.9214795174 9.3001889023
H 0.0000000000 -1.9667494814 10.2035875985
C 0.0000000000 -7.0785588292 4.8096893821
H 0.0000000000 -3.8920402612 2.2526801633
C 0.0000000000 -7.7764762644 7.4269789748
H 0.0000000000 -6.4895198549 11.2723392210
H 0.0000000000 -9.7510057483 7.9618546187
C 0.0000000000 8.8370151372 2.8082804336
C 0.0000000000 -8.8370151372 2.8082804336
C 0.0000000000 -11.4027492215 2.0904415055
C 0.0000000000 -12.0074993789 -0.5243687665
C 0.0000000000 -13.4586333052 3.8551546968
C 0.0000000000 -14.5393714607 -1.4230941699
H 0.0000000000 -10.4789414832 -1.8887264688
C 0.0000000000 -15.9512576649 2.9954578072
H 0.0000000000 -13.0841991896 5.8662503814
C 0.0000000000 -16.5244304986 0.4226939416
H 0.0000000000 -17.4819981944 4.3625745096
H 0.0000000000 -18.4803140548 -0.1770352134
C 0.0000000000 11.4027492215 2.0904415055
C 0.0000000000 13.4586333052 3.8551546968
C 0.0000000000 12.0074993789 -0.5243687665
C 0.0000000000 15.9512576649 2.9954578072
H 0.0000000000 13.0841991896 5.8662503814
C 0.0000000000 14.5393714607 -1.4230941699
H 0.0000000000 10.4789414832 -1.8887264688
C 0.0000000000 16.5244304986 0.4226939416
H 0.0000000000 17.4819981944 4.3625745096
H 0.0000000000 18.4803140548 -0.1770352134
C 0.0000000000 14.9635053736 -4.0493221673
C 0.0000000000 -14.9635053736 -4.0493221673
C 0.0000000000 16.7607237441 -6.0263593178
C 0.0000000000 15.8789435150 -8.5755034964
C 0.0000000000 19.4319839152 -5.6388690843
C 0.0000000000 17.5589742987 -10.5978109262
H 0.0000000000 13.8560181448 -8.9085223741
C 0.0000000000 21.0908119718 -7.6823849105
H 0.0000000000 20.1842234179 -3.7352081099
C 0.0000000000 20.1762827887 -10.1703189699
H 0.0000000000 16.8348312442 -12.5171604070
H 0.0000000000 23.1136334070 -7.3377253198
H 0.0000000000 21.4826258986 -11.7506799281
C 0.0000000000 -16.7607237441 -6.0263593178
C 0.0000000000 -15.8789435150 -8.5755034964
C 0.0000000000 -19.4319839152 -5.6388690843
C 0.0000000000 -17.5589742987 -10.5978109262
H 0.0000000000 -13.8560181448 -8.9085223741
C 0.0000000000 -21.0908119718 -7.6823849105
H 0.0000000000 -20.1842234179 -3.7352081099
C 0.0000000000 -20.1762827887 -10.1703189699
H 0.0000000000 -16.8348312442 -12.5171604070
H 0.0000000000 -23.1136334070 -7.3377253198
H 0.0000000000 -21.4826258986 -11.7506799281
units bohr
}

set globals{
reference rhf
basis 6-31g
df_basis_mp2 cc-pvdz-ri
scf_type df
RESTRICTED_DOCC [59, 8, 8,56]
active [3, 2, 3, 2 ]
mcscf_type df
mcscf_maxiter 100
calc_s_squared true
}
set_num_threads(8)
refscf, refwfn = energy(‘casscf’, return_wfn=True)

So I ran a gdb on this system. Here is the backtrace.

0 0x00007ffff212a8aa in _int_malloc () from /home/kannon/src/mvapich2-2.1/build/lib64/libmpi.so.12
1 0x00007ffff212b55a in malloc () from /home/kannon/src/mvapich2-2.1/build/lib64/libmpi.so.12
2 0x00007ffff1b54118 in operator new(unsigned long) () from /usr/lib64/libstdc++.so.6
3 0x00007ffff1b541c9 in operator new[](unsigned long) () from /usr/lib64/libstdc++.so.6
4 0x0000000004be7659 in psi::PsiOutStream::Printf (this=0x6021c50, format=0x5160e08 “(subgr_lex_addr): Impossible walk!\n”)
at /home/kannon/src/psi4src/psi4/src/lib/libparallel/PsiOutStream.cc:78
5 0x000000000197c1c0 in psi::detci::subgr_lex_addr (head=0x7fff9ec04b40, occs=0x7fff9fc19640, nel=5, norb=10) at /home/kannon/src/psi4src/psi4/src/bin/detci/og_addr.cc:81
6 0x000000000197c3d5 in psi::detci::og_lex_addr (Graph=0x7fff9fc11f20, occs=0x7fff9fc19640, nel=5, listnum=0x7fff9fc07cc0)
at /home/kannon/src/psi4src/psi4/src/bin/detci/og_addr.cc:139
7 0x00000000019a324c in psi::detci::CIWavefunction::set_ciblks (this=0x7fff9fc10e10) at /home/kannon/src/psi4src/psi4/src/bin/detci/set_ciblks.cc:120
8 0x00000000019874f8 in psi::detci::CIWavefunction::form_strings (this=0x7fff9fc10e10) at /home/kannon/src/psi4src/psi4/src/bin/detci/olsengraph.cc:170
9 0x000000000197f2bb in psi::detci::CIWavefunction::common_init (this=0x7fff9fc10e10) at /home/kannon/src/psi4src/psi4/src/bin/detci/ciwave.cc:114

You got a problem here:

   DOCC            =  57  11  12  56
   SOCC            =   0   0   0   0

   FROZEN DOCC     =   0   0   0   0
   RESTRICTED DOCC =  59   8   8  56
   ACTIVE          =   3   2   3   2
   RESTRICTED UOCC = 109  30  31 110
   FROZEN UOCC     =   0   0   0   0

The CI parsing should have caught this and thrown an exception however. Regardless “(subgr_lex_addr): Impossible walk!\n” should be a red flag.

What is weird about this is that I do not get that “subgr_lex_addr” printing out anywhere. It seems that the segmentation fault happens before that string is printed out.

I will see if this fixes my problem.

Its in your backtrace. It looks like something is going wrong with MPI + PsiOutStream, can’t help you on that one.

Everything works well now.
I have another question related to this molecule.

How can I use high spin orbitals as an initial guess for a low spin calculation?

I tried running

molecule {
0 11
GEOM
}
set globals {
reference ROHF
}
refscf, refwfn = energy(‘scf’, return_wfn=True)

set detci reference RHF
set detci S 0.0
energy(‘casscf’, ref_wfn=refwfn)

Thats close, but we use doccpi and soccpi of the incoming wavefunction as an initial guess. You have to change those, but you can only do that at the C-layer at the moment I think. Kinda hacky, but the following could work:

reference ROHF
rohfscf, rohfwfn = energy('scf', return_wfn=True)

# set multiplicity to singlet
reference RHF
refscf, refwfn = energy('scf', return_wfn=True)

for h in range(refwfn.nirrep()):
    refwfn.Ca().np[h][:] = rofhwfn.Ca().np[h][:]

energy('casscf', ref_wfn=refwfn)

Just any way to copy the ROHF orbitals into a RHF wavefunction. Lots of duplication, but not enough accessors currently to set these things correctly python-side.

Ah, alright. Thank you! I will try this out. Could I not change docc and socc in the input?

Hmm, that may be worth a shot. I thought RASSET used more of the incoming information, but it takes most of it from globals.

The ras_set function (ras_set_3() by now? I forget) should always listen to user input over all else. If that isn’t happening, then it’s not working as intended. Grabbing stuff out of the passed-in wavefunction should only be the default for cases when the user hasn’t said what they want.

Kevin maybe you can play with setting docc and socc for the HF, then changing it before the CASSCF, and let us know what happens.

So I played around with a simpler test case.

This test case allows you to use ROHF orbitals as a guess for CASSCF. It does seem that setting docc and socc will allow this to work. However, if you do not set docc and socc, the code reverts back to the ROHF reference.

It is also interesting that DOCC does not actually matter. If I specify docc [2,0,0,0,0,0,0,0] or [1, 0, 0, 0, 1, 0, 0] or [0, 0, 0, 0, 0, 2, 0, 0] I always end up getting the same answer.
I have to say that socc are all 0, otherwise the code fails. This might be important to check in the input (if RHF and SOCC, throw a warning/error).

What is DOCC used for in this case? Just to get number of electrons? Maybe the code should read the RDOCC and active and check if that is equal to the number of electrons? It seems that the code reads DOCC to determine the number of electrons.

memory 1 GB
molecule HHeH {
0 3
H 0.00 0.00 -2.0
He 0.00 0.00 0.00
H 0.00 0.00 2.0
units angstrom
}

set global{
e_convergence 1e-10
d_convergence 1e-10
scf_type pk
basis def2-tzvp
reference ROHF
DOCC [1,0,0,0,0,0,0,0]
SOCC [1,0,0,0,0,1,0,0]
}

scf_rohf, rohf_wfn = energy(‘scf’, return_wfn=True)

set detci {
reference RHF
restricted_docc [1,0,0,0,0,0,0,0]
active [1,0,0,0,0,1,0,0]
docc [2,0,0,0,0,0,0,0]
socc [0,0,0,0,0,0,0,0]
S 0.0
}
energy(‘casscf’, ref_wfn=rohf_wfn)

The code is in lib/libqt/ras_set.cc feel free to make a PR with any suggested changes.

Hmmm, let me take a look at this in the next day or two. The ras_set code is a little non-trivial so probably I should take a look at it.

Thank you! I am not sure where the number of electrons are computed in detci and I don’t really know where the reference is read. I skimmed through the ras_set, but I did not see any reference to the number of electrons and/or ROHF/RHF.

I am wondering if we should try to simplify a bit the CASSCF input. Most codes (that have CASSCF) typically have an option for the number of active orbitals and the number of active electrons. RESTRICTED_DOCC and ACTIVE are good keywords to keep (and I feel that they are necessary for any CASSCF code), but it is difficult to figure out the number of electrons that are included in the active space until you run the CI code. Right now, I typically run a SCF calculation, figure out the RESTRICTED_DOCC and ACTIVE keywords then run a CASSCF calculation. I also make sure that the number of active electrons equals the number of electrons that I want to include in the active space by inspecting the detci output or reading the NO occupations.

It might be nice to have a check where the number of electrons the user wants equals what the CI program is doing. I can make these changes, but I am not sure where this logic should go in the CI code.