Gradient() memory usage

SCP4 · November 1, 2018, 7:59pm

Dear Psicoders,
Psi4 is a really great system. I’m using the latest stable release, 1.2, on Ubuntu 16.04. I’m having some problems with CCSD(T) runs of the closed-shell ground state of the diatomic cation CaF+. I need the multipoles through octupole or hexadecapole, and am going to need some external potential runs. CFOUR is a point of comparison. Using the EMSL basis sets cc-pCVTZ and cc-pCVQZ in CFOUR, and the supplied ones in PSI4 (which appear to be identical to the EMSL versions), multipoles can be computed with oeprop() preceded by energy() or by gradient().

1. The oeprop() results after energy() and gradient() are different. Is it true that you have to run gradient() to use analytical derivatives?
2. gradient() is slow and uses a lot of memory. cc-pCVQZ will not complete even with 28 GB. Is there some change to be made to the input to allow it to do the calculation?
3. For the case where both programs complete, cc-pCVTZ, PSI4-gradient-oeprop and CFOUR have different quadrupoles, both of which should be analytical. However, the two basis set formats differ in precision by 1 significant digit. That can be tested, but the test hasn’t been done yet.
4. Is there anything that can be done about the failure to complete due to memory for CaF+ with cc-pCVQZ?
(Dropbox link to info about the runs, "Data on PSI4 and CFOUR runs on CaF+": https://www.dropbox.com/sh/4j3j1qjohhagyh0/AAD5ydgig7yUVnXJPkLBFmhra?dl=0 )
Thanks very much,
Steve Coy

loriab · November 2, 2018, 10:22am

I suspect that roughly what’s going on is as follows. oeprop() is the unified properties interface for most methods where the density is about as cheap as the energy. Then there’s coupled cluster, where one has to solve the lambda equations, doubling the cost. In order to drive this, CC properties should be accessed through properties().

On point 1, oeprop after energy is probably using HF, which is the most recent available density. Is there a label near the properties printout? We put that in to try to avert misinterpretation. Especially for CC, you’ll need a gradient-level calc for properties.

On point 3, CFOUR is wonderfully able when it comes to CC higher excitations and properties. They do have CCSD(T) quadrupoles and octupoles, whereas Psi4 only has CCSD and only through quadrupole.

The people closer to the CC code can correct me on any of these points. @amjames @robertodr
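
For reference, the properties() route described above might look roughly like this with the Psi4 Python API. This is a sketch under assumptions, not a tested input: the function names are hypothetical, the bond length is a caller-supplied placeholder, and only the CCSD dipole and quadrupole are requested, in line with the limits mentioned above.

```python
def caf_cation_geometry(r_angstrom):
    """Psi4 geometry string for CaF+ (charge +1, singlet) at a given bond length."""
    return f"""
1 1
Ca 0.0 0.0 0.0
F  0.0 0.0 {r_angstrom}
units angstrom
"""


def ccsd_multipoles(r_angstrom, basis="cc-pCVTZ", memory="28 GB"):
    """Request CCSD dipole and quadrupole moments through properties()."""
    import psi4  # imported here so the helper above works without Psi4 installed

    psi4.set_memory(memory)
    psi4.geometry(caf_cation_geometry(r_angstrom))
    psi4.set_options({"basis": basis})
    # properties() takes care of the lambda equations and the CC density;
    # the requested moments are written to the Psi4 output file.
    return psi4.properties("ccsd", properties=["dipole", "quadrupole"])
```

The basis and memory defaults simply mirror the values quoted in the thread; they are not recommendations.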

ashutosh · November 2, 2018, 12:16pm

As @loriab said, you need to invoke a CCSD(T) gradient calculation to get the one-electron density and, from it, the properties. Running a CCSD(T) energy calculation doesn’t require building the density, and thus the SCF density might have been used.

The memory requirements are of course quite high for conventional gradients. The gradient code (unlike the energy code) is not threaded, making it slow as well. (We are planning to thread it soon, although that might require even more memory.) If memory is your bottleneck, you might consider using the density-fitted CCSD(T) gradients in psi4 implemented by @bozkaya.
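
A rough sketch of the gradient-then-oeprop sequence discussed here, using the Psi4 Python API. The function names and the title string are hypothetical, the bond length is a placeholder supplied by the caller, and whether the wavefunction returned by gradient() carries the correlated density in a given Psi4 version is an assumption based on the results reported in this thread. The MULTIPOLE(n) keyword for higher moments is taken from the oeprop() documentation and should be checked against your version.

```python
def multipole_keyword(order):
    """oeprop() keyword requesting all multipoles up to the given order."""
    return f"MULTIPOLE({order})"


def ccsd_t_gradient_then_oeprop(r_angstrom, max_order=3,
                                basis="cc-pCVTZ", memory="28 GB"):
    """Run a CCSD(T) gradient, then evaluate multipoles from its wavefunction."""
    import psi4  # imported lazily so multipole_keyword() works without Psi4

    psi4.set_memory(memory)
    psi4.geometry(f"""
1 1
Ca 0.0 0.0 0.0
F  0.0 0.0 {r_angstrom}
units angstrom
""")
    psi4.set_options({"basis": basis})

    # The gradient calculation is what builds the correlated density.
    grad, wfn = psi4.gradient("ccsd(t)", return_wfn=True)

    # The title labels the printout so it is not mistaken for an SCF result.
    psi4.oeprop(wfn, multipole_keyword(max_order), title="CCSD(T) after gradient")
    return grad, wfn
```

Calling oeprop() on the wavefunction from energy('ccsd(t)') instead would, per the explanation above, most likely report SCF-level moments.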

SCP4 · November 2, 2018, 3:20pm

Thanks very much for the perspective.
My results with cc-pCVTZ seem to show that gradient() leads to the correlated analytical-gradient result for the dipole, and perhaps the quadrupole, while energy() does not, as you say. It seems sad that psi4 given 30GB cannot do the same with cc-pCVQZ for this simple closed-shell diatomic (which CFOUR does with a fraction of the memory). (Note that the memory is very slowly nibbled away. Could there be a leak? Is this pre-auto*?) I will look at the density-fitted CCSD(T) gradients by @bozkaya.
Best wishes
Steve Coy

SCP4 · November 3, 2018, 1:15pm

As a suggestion, the performance of the latest SSDs makes storing it all in memory unnecessary. R/W is of nearly equivalent performance (current drives reach about 3000 MB/s read or write, e.g. Samsung NVMe M.2). It does require more coding or an access wrapper.

Diazonium · November 3, 2018, 9:31pm

I would caution people who use SSDs as a scratch drive for quantum chemistry. The performance is indeed excellent, even with good SATA SSDs, but the big problem is the limited program/erase durability of the flash memory chips used in SSDs.

We have had SSDs that reached their rated total lifetime writes (300 TB) in 4 months, in 6-core machines. While good-quality SSDs (especially the V-NAND Samsung SSDs that store only 2 bits per cell) can typically handle significantly more writes than their official rating, reliability is not guaranteed.

Therefore, unless one is using the Intel Optane SSDs (very expensive, but extremely fast and high endurance), it is advisable to keep a close eye on the total write counter of SSDs when they are being used for quantum chemistry.
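
To put the 300 TB in 4 months figure in perspective, here is a small back-of-the-envelope calculation. It assumes 30-day months and decimal units (1 TB = 10^6 MB); the function name is just for illustration.

```python
SECONDS_PER_DAY = 86_400


def average_write_rate(total_written_tb, days):
    """Return (TB per day, MB per second) averaged over the whole period."""
    tb_per_day = total_written_tb / days
    mb_per_s = tb_per_day * 1e6 / SECONDS_PER_DAY  # 1 TB = 1e6 MB (decimal)
    return tb_per_day, mb_per_s


# 300 TB written in about 4 months, taking a month as 30 days (an assumption).
tb_per_day, mb_per_s = average_write_rate(300, 4 * 30)
print(f"{tb_per_day:.1f} TB/day, {mb_per_s:.1f} MB/s sustained")
```

That works out to about 2.5 TB per day, or roughly 29 MB/s of writes averaged around the clock, which is far below what the drive can sustain in bursts but enough to exhaust the rating within months.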

SCP4 · November 4, 2018, 6:52pm

Thanks for your note on SSDs; the endurance problem is certainly real, and widely reported. It now happens that a 280GB Intel Optane 900P in the 2.5" package is $250. It should be possible to put this into an Ubuntu system as a second drive, reconfigure the system with all the swap space there, and make all your runs with all your jobs and scratch space there, leaving the root drive alone to live long. If the scratch drive fails, replace it; the system is OK. If you need more memory than the board will hold, swap can do that for a cost in time. As you say, the TBW must be monitored on any SSD system.
Thanks again,
Steve Coy
(Nonetheless, I still suspect a PSI4 bug in my current example)

bozkaya · November 5, 2018, 11:54am

I can say that DF-CCSD(T) gradients use much less memory than the conventional ones and are significantly faster. You can try the DFOCC module. @SCP4
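
A minimal sketch of how a DF-CCSD(T) gradient might be routed to DFOCC from the Psi4 Python API. The function names are hypothetical and the bond length is a caller-supplied placeholder. The option set (scf_type, cc_type, qc_module) follows the usual DFOCC pattern but is an assumption here, as is the availability of default auxiliary basis sets for Ca with the cc-pCVnZ family; both should be checked against the DFOCC documentation.

```python
def dfocc_gradient_options(basis):
    """Options that select density fitting and the DFOCC module (assumed set)."""
    return {
        "basis": basis,
        "scf_type": "df",      # density-fitted reference
        "cc_type": "df",       # density-fitted coupled cluster
        "qc_module": "dfocc",  # route the CC calculation to DFOCC
    }


def df_ccsd_t_gradient(r_angstrom, basis="cc-pCVTZ", memory="28 GB"):
    """Run a DF-CCSD(T) gradient for CaF+ and return (gradient, wavefunction)."""
    import psi4  # imported lazily so the options helper works without Psi4

    psi4.set_memory(memory)
    psi4.geometry(f"""
1 1
Ca 0.0 0.0 0.0
F  0.0 0.0 {r_angstrom}
units angstrom
""")
    psi4.set_options(dfocc_gradient_options(basis))
    return psi4.gradient("ccsd(t)", return_wfn=True)
```

Whether the multipoles needed in this thread can be obtained from the resulting wavefunction is not established here.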