Possible performance regression with Coupled-DIIS in closed-shell OO-MP3 between 1.6 and 1.7

It was pointed out a few years ago that DF OCC orbital convergence was somewhat loose. In the summer of 2022, several files were updated (I suspect that ‘occ_iterations.cc’ is the most relevant here) to fix this by (i) tightening the pertinent convergence criteria and (ii) introducing coupled DIIS. This landed in Psi 1.7 and, as far as I can tell, is unchanged in Psi 1.8. For the technical details, see the GitHub issue “Coupled DIIS for `dfocc`” (psi4/psi4 #2215).

The coupled DIIS introduced in Psi 1.7 nicely cured some pathological convergence issues with open-shell molecules that I had experienced in the past, and it did not seem to hurt OO-MP2 or OO-LCCD calculations. However, I keep seeing significantly slower convergence in closed-shell OO-MP3 calculations with coupled DIIS. It is not just that more steps are needed to reach the new, tighter threshold: Psi 1.7 also takes many more steps than Psi 1.6 to reach the old convergence criterion.

I am wondering if this is just some fundamental peculiarity of closed-shell OMP3 (in which case, maybe we could have an input switch to turn off coupled DIIS), or could the program code in the closed-shell MP3 blocks be further improved to alleviate this issue? Here is a small test case:

# RHF OO-MP3/cc-pVTZ energy & gradient of cis-methyl acetate
# MP2/aug-cc-pVTZ minimum

memory 32000 mb

molecule {
  0 1
 C
 C 1 cc2
 O 1 oc3  2 occ3
 O 1 oc4  3 oco4  2  dih180
 C 4 co5  1 coc5  2  dih180
 H 5 hc6  4 hco6  1  dih180
 H 5 hc7  4 hco7  1  dih7
 H 5 hc7  4 hco7  1 -dih7
 H 2 hc9  1 hcc9  3  dih0
 H 2 hc10 1 hcc10 9  dih10
 H 2 hc10 1 hcc10 9 -dih10

 cc2=1.50008816
 oc3=1.21208376
 occ3=125.9531013
 oc4=1.35111759
 oco4=123.28591107
 co5=1.43640959
 coc5=114.05344724
 hc6=1.08451743
 hco6=105.44756365
 hc7=1.0876063
 hco7=110.3168028
 dih7=60.31709457
 hc9=1.08455439
 hcc9=109.42059662
 hc10=1.08844081
 hcc10=109.63897676
 dih10=120.99159672
 dih0=0.
 dih180=180.
  units angstrom
}

set globals {
 reference rhf
 qc_module occ
 scf_type df
 mp_type df
 basis cc-pVTZ
 freeze_core true
}

set occ {
 orb_opt true
}

gradient('omp3')

With Psi 1.6, I get at the last step:

<   19     -267.9336635755     -1.68e-09       3.26e-07         9.38e-06        5.15e-11

and with Psi 1.7, I get:

>   19     -267.9336624434     -3.12e-07       9.37e-06         2.99e-04        2.76e-10
...
>   58     -267.9336635756     -1.89e-10       2.65e-07         9.55e-06        3.71e-12
...
>   81     -267.9336635768     -7.16e-12       5.31e-08         1.98e-06        5.33e-13

So it currently takes 58 steps to reach the old threshold, versus 19 steps in the older version.
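For anyone repeating this comparison on other molecules, here is a minimal Python sketch that scans iteration lines like the ones quoted above and reports the first iteration whose maximum orbital gradient drops below a chosen threshold. The column order (iteration, total energy, energy change, RMS orbital gradient, max orbital gradient, RMS amplitude residual) is an assumption about the output layout, the function names are hypothetical, and the value 1e-5 is only a stand-in for the old criterion, picked because both final lines quoted above fall just under it.

```python
# Iteration lines pasted verbatim from the Psi 1.7 output quoted in this post.
PSI17_LINES = """\
>   19     -267.9336624434     -3.12e-07       9.37e-06         2.99e-04        2.76e-10
...
>   58     -267.9336635756     -1.89e-10       2.65e-07         9.55e-06        3.71e-12
...
>   81     -267.9336635768     -7.16e-12       5.31e-08         1.98e-06        5.33e-13
"""


def parse_iterations(text):
    """Return (iter, E, dE, rms_grad, max_grad, rms_t2) tuples from output text.

    Assumed column order; lines that do not have six numeric fields are skipped.
    """
    rows = []
    for line in text.splitlines():
        # Drop diff-style markers ("<" or ">") left over from pasting.
        fields = line.lstrip("<> ").split()
        if len(fields) != 6:
            continue
        try:
            rows.append((int(fields[0]), *map(float, fields[1:])))
        except ValueError:
            continue
    return rows


def first_iteration_below(rows, max_grad_threshold):
    """First iteration whose max orbital gradient is under the threshold, else None."""
    for iteration, _e, _de, _rms_grad, max_grad, _rms_t2 in rows:
        if abs(max_grad) < max_grad_threshold:
            return iteration
    return None


rows = parse_iterations(PSI17_LINES)
# 1e-5 is an assumed stand-in for the old max-gradient criterion.
print(first_iteration_below(rows, 1e-5))
```

Only three Psi 1.7 lines are quoted here, so the result is the first of those three; fed the full iteration table, the same function gives the exact step count.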

As a side note, it would be nice if keywords such as mo_maxiter and max_mograd_convergence worked within the set occ {} block; right now they are only obeyed when set in the globals {} block.

Thank you for looking into this!

  1. I’m willing to look into this. Can you please provide a smaller example? This example is still large enough to be inconvenient for my investigations. Maybe you can replace some of those -CH3 groups with a hydrogen? I can work with this if I need to, but it’ll take longer.
  2. This is incredibly obnoxious, but that’s because you need to set a dfocc block, not an occ block. This makes perfect sense if you know how the code works, but is opaque from the documentation. The occ/dfocc documentation has been a problem spot for a long time. There’s another major issue in dfocc that needs to be addressed before anybody is willing to straighten out the documentation.
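A minimal sketch of that scoping, using the two keywords mentioned earlier in the thread with purely illustrative values. In an input file this would correspond to a `set dfocc { ... }` block; the Python form below uses `psi4.set_options` with the `module__option` key convention, and the import guard is only there so the snippet can be read and run without Psi4 installed.

```python
# Keyword names come from this thread; the values are illustrative only.
# The "dfocc__" prefix scopes each option to the dfocc module rather than occ.
dfocc_options = {
    "dfocc__mo_maxiter": 200,
    "dfocc__max_mograd_convergence": 1e-5,
}

try:
    import psi4  # only available where Psi4 is installed
except ImportError:
    psi4 = None

if psi4 is not None:
    # Apply the module-scoped options before calling gradient('omp3').
    psi4.set_options(dfocc_options)
```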
  1. Sure. Here is CO:
# RHF OO-MP3/cc-pVDZ energy & gradient of carbon monoxide

memory 32000 mb

molecule {
  0 1
 C
 O 1 co

 co=1.139
  units angstrom
}

set globals {
 reference rhf
 qc_module occ
 scf_type df
 mp_type df
 basis cc-pVDZ
 freeze_core true
}

set occ {
 orb_opt true
}

gradient('omp3')

This converges in 16 steps with Psi 1.6. With Psi 1.7, it takes 70 steps to reach the old threshold and 93 steps to converge to the new criteria.

By the way, methyl acetate becomes manageable when you drop down to cc-pVDZ (although Psi 1.7 then fails because 100 iterations are not enough). I was just trying to avoid eye-rolling from experts at an input file with an OMP3/cc-pVDZ directive (see “Convergence of third order correlation energy in atoms and molecules” for details).

If you find a quick fix, I am happy to run it through a larger set of cases in the coming days, as I am about to update my Psi4 and I compile from source.

  2. Ha! Good to know. Feel free to reach out if you need help with documentation when the “another major issue in dfocc” gets fixed.