I am puzzled by a set of frozen-core DF-CCSD(T) calculations of cyclopropylbenzene that in my hands produce wrong (T) corrections in the aug-cc-pVTZ basis. I believe the correct (T) contribution should be about -0.079 Hartree but I get about -0.125 Hartree with Psi4.
I am running single point energy calculations on MP2/aug-cc-pVTZ geometries for two minima (one of the inputs is attached) as well as for a rotational saddle point. Here is what I see:
The (T) correction in CCSD(T)/aug-cc-pVTZ, calculated without density fitting using Dalton is -0.079 Hartree.
DF-CCSD(T)/cc-pVTZ gives correct energies for all three stationary points, with (T) corrections about -0.077 Hartree
FNO-CCSD(T)/aug-cc-pVTZ with DF-CCSD gives correct energies for all three stationary points, with (T) corrections about -0.079 Hartree.
FNO-CCSD(T)/cc-pVQZ with DF-CCSD gives a correct energy for the saddle point, with (T) corrections about -0.083 Hartree; I have not checked the minima with cc-pVQZ. So far, so good
DF-CCSD(T)/aug-cc-pVTZ gives incorrect (T) energies for all three stationary points, with (T) corrections about -0.125 Hartree. The effect of (T) on the rotational barrier is now totally wrong (ca 2.5 kcal/mol vs 0.2 kcal/mol). SCF, E2 and CCSD energies are correct with the aug-cc-pVTZ basis.
Eliminating one or two of the smallest eigenvalues of the overlap matrix did not help.
Tightening various convergence criteria did not help.
It did not matter if I used 2, 4, or 6 threads.
Both the correct DF-CCSD(T)/cc-pVTZ and the incorrect DF-CCSD(T)/aug-cc-pVTZ calculations chose the LOW_MEM Wabef algorithm in DF-CCSD. All DF-CC calculations are running in C1 symmetry.
For a few other (simpler) molecules that I have tested, DF-CCSD(T)/aug-cc-pVTZ calculation seems to give correct (T) energies.
All Psi4 calculations are done with Psi4 1.1a2.dev323, which was compiled from git source with Intel Parallel Studio XE 2017, update 1 (icc 17.0.1 20161005) against MKL with -DMAX_AM_ERI=7. The calculations use Broadwell-E i7-6850K CPU on CentOS.
I am at my wit’s end why my DF-CCSD(T)/aug-cc-pVTZ calculation goes bananas for cyclopropylbenzene. Am I overlooking something about how to properly run DF-CCSD(T) calculations? Issues with my compiler or compilation flags? Still some linear dependencies in the overlap matrix? An unexpected feature in the DF-CCSD(T) code? An unexpected behavior of i7-6850K CPU? Any words of wisdom or notes about if this is reproducible on other systems are most welcome.
As for the MO overlap screening:
s_tolerance 1e-7 (no deletions) (T) = -0.126108
s_tolerance 5e-7 (1 deletion) (T) = -0.126365
s_tolerance 1e-6 (2 deletions) (T) = -0.123487
Woudn’t a serious linear dependency in the overlap matrix already manifest in poor SCF convergence and affect E(2) results? These are OK.
As for readers who are wondering about T1 diagnostic, it is 0.009814 regardless of how many MOs were dropped.
As for the FNO calculation, the attached input (with more memory) worked for me.
What module do you use for DF-CCSD(T)? Is it FNOCC or DFOCC? In psi4 both modules have DF-CCSD(T) codes. In the case of DFOCC module, if you are running multiprocessor jobs, then try serial jobs. It might be related with the OpenMP library that you are using.
In the case of DFOCC module, please send me your input and output files so that I can verify the problem and I can fix it if there is a bug.
Thank you for following up. Only DFOCC module causes trouble, FNOCC seems OK.
Looks like the issue also shows up in the DF-CCSD(T)/aug-cc-pvtz calculation of ethylbenzene; here Psi4 predicts that (T) correction is -0.11 Hartree while the correct value for ethylbenzene is closer to -0.07 Hartree. Benzene calculation, however, is OK in the same basis.
I have e-mailed you the pertinent cyclopropylbenzene files. It will be a few days before I can say anything about the serial result.
Thank you for sending your inputs. I have submitted cyclopropylbenzene jobs to reproduce the problem that you observed with 1-cpu and 6-cpu.
Meanwhile I have submitted DF-CCSD(T)/cc-pvdz calculation of ethylbenzene. With 1-cpu or 6-cpu the DFOCC module gives the same result [(T) = -0.04694602843749 au] with FNOCC module. Hence, there is no problem with the cc-pvdz basis. I will further investigate it with the aug-cc-pvtz basis set.
Thus, the problem also occurs in a single-thread calculation. With cc-pVTZ and aug-cc-pVDZ, DFOCC results for ethylbenzene look OK. Toluene in aug-cc-pvTZ also seems OK.
I am wondering:
i) What would the results look like for Psi4 compiled with ATLAS or OpenBLAS instead of MKL?
ii) Is there a way to print out separately
E4 doubles and triples
E5 singles and triples
contributions to the (T) correction when using the DFOCC module?
After some poking around, I noticed that DFOCC triples can be calculated using a few different algorithms. The incorrect results that I have seen so far are all based on the default algorithm, TRIPLES_IABC_TYPE = DISK.
When I tried TRIPLES_IABC_TYPE = DIRECT, a correct (T) contribution emerged for ethylbenzene in its non-planar conformation:
< TRIPLES_IABC_TYPE => DISK
< (T) Correction (a.u.) : -0.11066433016096
< DF-CCSD(T) Total Energy (a.u.) : -310.35121994181452
TRIPLES_IABC_TYPE => DIRECT
(T) Correction (a.u.) : -0.06993585855336
DF-CCSD(T) Total Energy (a.u.) : -310.31049147020696
Inspection of these algorithms in src/psi4/dfocc/ccsd_triples.cc leads me to think that maybe the span of (ia|bc) type integrals to be read from the disk is not correctly determined beyond a certain size of the integral file.
Here are two of additional observations that may help developers to troubleshoot what is going on with DFOCC triples:
The incorrect (T) contribution in DF-CCSD(T) when using disk-based (ia|bc) type integrals is not specific to substituted benzenes but occurs whenever the the system is larger than some critical size. For example:
Molecule Basis #BF-reg#BF-jkfit#BF-dfcc#ijk-comb (T)-correction
Benzene aug-cc-pvTZ 414 900 912 680 OK
Toluene aug-cc-pvTZ 506 1096 1110 1140 OK
Benzaldehyde aug-cc-pVTZ 506 1108 1124 1540 OK
Anisole aug-cc-pvTZ 552 1200 1216 1771 WRONG by 0.0166
Ethylbenzene cc-pVTZ 380 932 948 1771 OK
Ethylbenzene aug-cc-pVTZ 598 1292 1308 1771 WRONG by 0.0407
cPropylbenzene cc-pVTZ 410 1011 1029 2300 OK
cPropylbenzene aug-cc-pVTZ 644 1396 1414 2300 WRONG by 0.0423
Methyl Acetate aug-cc-pVTZ 368 796 806 680 OK
Methyl Acetate cc-pVQZ 455 836 990 680 OK
Methyl Acetate aug-cc-pVQZ 676 1166 1320 680 WRONG by 0.03
It seem that calculations with more than 1200 basis functions in DF-CC basis are problematic when the disk-based algorithm is used, These calculations are OK with the direct algorithm. which on my system is slightly slower than the disk-based algorithm. As a temporary work-around, maybe Psi4 should switch to direct triples in DFOCC when the number of DF-CC basis functions is larger than about 1100?
When I attempt TRIPLES_IABC_TYPE = INCORE calculations, I always get segmentation faults (core dump) for calculations that with the disk-based methods produce the wrong correction. However, for calculations that give the correct (T) correction, the use of in-core (ia|bc) type integrals is successful. I am pretty sure the segfaults in larger calculations are not due to insufficient memory; I am guessing that the system senses that (ia|bc) type integrals are out of the allowed bounds in RAM an crashes the job. I think the following set of dmesg errors are related to this:
psi4: segfault at 8 ip 00007faaac26b37f sp 00007ffe1e5d05c0 error 4 in core.so[7faaaaec0000+90f7000]
psi4: segfault at 7f313086f990 ip 00007f3c1f90a050 sp 00007fff793cd2d8 error 4
psi4: segfault at 7f315e00d510 ip 00007f3c1f90a050 sp 00007f3c3799c1d8 error 4
psi4: segfault at 7f313086f990 ip 00007f3c1f90a050 sp 00007f3c301992d8 error 4
Thank you very much for helping me to figure out the problem. As you noted the problem occurs for large molecular systems, this why I did not noticed it by now. There is noting wrong in my formulas. The problem is a simple integral overflow problem. In (T) there are some very large integral variables that are exceeding the size of c++ “int” variable. When I have just replace “int” variable type with “long int” the problem is solved. I will push a fix soon.
Thank you for fixing this. It is working well now. In my case, the disk-based calculation for these larger (500-600 basis functions) systems is about 8% faster than the direct calculation when using six MKL threads on i7-6850K CPU along with Intel 750 PCIe SSD.
P.S. Maybe you also want to update the “Latest Revision September 9, 2015” line?
I am glad that your problem is solved. I know that disk-based algorithm is faster, this why it is default. I will update the (T) code in the near future, the new code will include DF-CCSD(T) analytic gradients, too. At that time I will update all headers.