Calculations with custom basis sets run slower with multiple threads

Hi,

I’m running energy and gradient calculations with a custom basis set (user-defined *.gbs file). With 8 threads, the SCF energy step shows only a small speedup, but the SCF GRAD step takes much longer than it does with a single thread. If I use a built-in basis (like aug-cc-pvdz), I don’t see this problem.

Output using a built-in basis (aug-cc-pvdz), 8 threads:

SCF
Module time:
    user time   =     191.44 seconds =       3.19 minutes
    system time =      17.54 seconds =       0.29 minutes
    total time  =         28 seconds =       0.47 minutes

SCF GRAD
Module time:
    user time   =      67.71 seconds =       1.13 minutes
    system time =       3.68 seconds =       0.06 minutes
    total time  =         13 seconds =       0.22 minutes

Output using a custom basis (user-defined *.gbs), 8 threads:

SCF
Module time:
    user time   =     809.34 seconds =      13.49 minutes
    system time =      12.01 seconds =       0.20 minutes
    total time  =        540 seconds =       9.00 minutes

SCF GRAD
Module time:
    user time   =    3412.75 seconds =      56.88 minutes
    system time =      14.06 seconds =       0.23 minutes
    total time  =       3043 seconds =      50.72 minutes
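
For reference, a quick way to read these timings: dividing CPU time (user + system) by the "total time" line gives the average number of cores kept busy. This is only a minimal sketch. It assumes "total time" is wall-clock time (it is printed in whole seconds, so the ratios are approximate), and the dictionary layout and the name `busy_cores` are just for illustration.

```python
# Module timings copied from the post above (8 threads requested in every run).
timings = {
    ("aug-cc-pvdz", "SCF"):      {"user": 191.44,  "system": 17.54, "wall": 28.0},
    ("aug-cc-pvdz", "SCF GRAD"): {"user": 67.71,   "system": 3.68,  "wall": 13.0},
    ("custom",      "SCF"):      {"user": 809.34,  "system": 12.01, "wall": 540.0},
    ("custom",      "SCF GRAD"): {"user": 3412.75, "system": 14.06, "wall": 3043.0},
}


def busy_cores(t):
    """Average number of cores kept busy: CPU time divided by wall time.

    Assumes the "total time" line of the module timing is wall-clock time.
    """
    return (t["user"] + t["system"]) / t["wall"]


ratios = {key: busy_cores(t) for key, t in timings.items()}

for (basis, module), ratio in ratios.items():
    print(f"{basis:12s} {module:9s} {ratio:5.2f} cores busy on average")
```

With the numbers above this comes out to about 7.5 for the built-in SCF and about 1.1 for the custom-basis SCF GRAD, so the custom-basis gradient behaves almost like a serial run even though 8 threads were requested.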

I’m running v1.4.1, but the problem also occurs for v1.5.

I’ve tried using a different custom basis set and it worked fine, so there must be something about this specific basis set that triggers the slowdown. I’ve attached a basis set that can be used for water. Any help would be appreciated. Thank you.

spherical
****
H     0
S    10   1.00
   5485.0000                 0.7866305532E-05
    803.5000                 0.1005295712E-03
    175.7000                 0.4292187586E-03
     49.6700                 0.2200643087E-02
     14.5300                 0.8841925973E-02
      4.7790                 0.3294507491E-01
      1.6430                 0.1062259599E+00
      0.5976                 0.2801966826E+00
      0.2213                 0.4671465639E+00
      0.0834                 0.2454942559E+00
S    10   1.00
   5485.0000                 0.3164187581E-04
    803.5000                 0.3443747244E-03
    175.7000                 0.1764516658E-02
     49.6700                 0.7204531541E-02
     14.5300                 0.3639479164E-01
      4.7790                 0.1011289813E+00
      1.6430                 0.4394673698E+00
      0.5976                 0.6216447394E+00
      0.2213                -0.3927500880E+00
      0.0834                -0.6498385446E+00
S    10   1.00
   5485.0000                -0.2113496601E-02
    803.5000                 0.7788001380E-02
    175.7000                -0.3384743617E-01
     49.6700                 0.5534432944E-01
     14.5300                -0.3026864839E+00
      4.7790                 0.3857649550E+00
      1.6430                 0.1331777106E+01
      0.5976                -0.1529390499E+01
      0.2213                -0.4704435423E+00
      0.0834                 0.1020444790E+01
P    10   1.00
   5485.0000                -0.3953144014E-06
    803.5000                 0.1874336280E-05
    175.7000                -0.6531588558E-05
     49.6700                 0.1754756857E-04
     14.5300                -0.4630534015E-04
      4.7790                 0.1360128758E-03
      1.6430                 0.2868815848E+00
      0.5976                 0.5557583026E+00
      0.2213                 0.2864950363E+00
      0.0834                -0.2045185477E-04
P    10   1.00
   5485.0000                -0.5125171990E-02
    803.5000                -0.2441205404E-01
    175.7000                -0.5039057710E-01
     49.6700                -0.1319025123E+00
     14.5300                -0.2985422460E+00
      4.7790                -0.6061611919E+00
      1.6430                 0.4387209187E-01
      0.5976                 0.6711451011E+00
      0.2213                -0.2566304511E+00
      0.0834                -0.4339823222E+00
P    10   1.00
   5485.0000                 0.1941733156E-02
    803.5000                 0.7385234434E-02
    175.7000                 0.1969937929E-01
     49.6700                 0.3717438370E-01
     14.5300                 0.1169104114E+00
      4.7790                 0.1538910789E+00
      1.6430                 0.4688944216E+00
      0.5976                 0.2149323925E+00
      0.2213                -0.6304127293E+00
      0.0834                -0.3639743687E+00
D    3   1.00
      1.6430                 0.4593061289E+00
      0.5976                 0.5361263152E+00
      0.2213                 0.1682822865E+00
****
O     0
S    15   1.00
 250600.0000                -0.1554856905E-04
  37410.0000                -0.1209385232E-03
   8487.0000                -0.6401536512E-03
   2380.0000                -0.2748753857E-02
    765.1000                -0.1008317716E-01
    271.6000                -0.3266204403E-01
    103.3000                -0.9211196194E-01
     41.5900                -0.2121259840E+00
     17.4400                -0.3599945239E+00
      7.5080                -0.3394356234E+00
      3.1880                -0.9882909067E-01
      1.3280                -0.3023393966E-02
      0.5483                -0.7405694553E-02
      0.2305                -0.1756426788E-02
      0.0967                -0.4267855471E-03
S    15   1.00
 250600.0000                -0.3246131703E+00
  37410.0000                 0.9055169569E+00
   8487.0000                -0.1446777515E+01
   2380.0000                 0.6046498773E+00
    765.1000                 0.3576211666E+00
    271.6000                 0.2211568856E+00
    103.3000                -0.8230551983E+00
     41.5900                -0.7490507295E-01
     17.4400                 0.1007941389E+01
      7.5080                 0.4444754059E-01
      3.1880                -0.1346632290E+01
      1.3280                 0.4510661453E+00
      0.5483                 0.8737272713E+00
      0.2305                -0.5374303894E+00
      0.0967                -0.1971181004E+00
S    15   1.00
 250600.0000                 0.1495934818E+00
  37410.0000                -0.4167093333E+00
   8487.0000                 0.6673422194E+00
   2380.0000                -0.2662495372E+00
    765.1000                -0.1469701279E+00
    271.6000                -0.5221631917E-01
    103.3000                 0.5534919276E+00
     41.5900                 0.3731456774E+00
     17.4400                -0.2152192841E-01
      7.5080                -0.2078565455E+00
      3.1880                -0.1253481544E+01
      1.3280                 0.7931874993E+00
      0.5483                 0.7761822865E+00
      0.2305                -0.6778874100E+00
      0.0967                -0.1189879624E+00
S    15   1.00
 250600.0000                -0.3695005466E-05
  37410.0000                -0.2873934318E-04
   8487.0000                -0.1515886637E-03
   2380.0000                -0.6547315136E-03
    765.1000                -0.2402796750E-02
    271.6000                -0.7926375009E-02
    103.3000                -0.2295911976E-01
     41.5900                -0.5729207519E-01
     17.4400                -0.1142709592E+00
      7.5080                -0.1631038926E+00
      3.1880                -0.4324623372E-01
      1.3280                 0.3209563225E+00
      0.5483                 0.5329843566E+00
      0.2305                 0.2755039880E+00
      0.0967                 0.2118093098E-01
S    15   1.00
 250600.0000                 0.4479451128E-05
  37410.0000                 0.3551171088E-04
   8487.0000                 0.1859246321E-03
   2380.0000                 0.8234428661E-03
    765.1000                 0.2969104858E-02
    271.6000                 0.9968728333E-02
    103.3000                 0.2860112700E-01
     41.5900                 0.7480572942E-01
     17.4400                 0.1502604940E+00
      7.5080                 0.2674578695E+00
      3.1880                -0.4651473575E-01
      1.3280                -0.1369143245E+01
      0.5483                 0.2028380781E+00
      0.2305                 0.9375921143E+00
      0.0967                 0.6886519840E-01
S    15   1.00
 250600.0000                -0.1375797593E+00
  37410.0000                 0.3848871390E+00
   8487.0000                -0.6141823399E+00
   2380.0000                 0.2759647764E+00
    765.1000                 0.1649567222E+00
    271.6000                 0.1492102487E+00
    103.3000                -0.1548052270E+00
     41.5900                 0.2214277563E+00
     17.4400                 0.6979661254E+00
      7.5080                -0.1510698346E+01
      3.1880                -0.7294235430E-01
      1.3280                 0.2283640401E+01
      0.5483                -0.2155899681E+01
      0.2305                -0.2038537246E+00
      0.0967                 0.1153267905E+01
P    15   1.00
 250600.0000                 0.2265978522E-02
  37410.0000                 0.6920810442E-02
   8487.0000                 0.1737677649E-01
   2380.0000                 0.4158396771E-01
    765.1000                 0.8054732821E-01
    271.6000                 0.1663537752E+00
    103.3000                 0.2522486176E+00
     41.5900                 0.3942921618E+00
     17.4400                 0.2228211238E+00
      7.5080                 0.1469758163E+00
      3.1880                 0.6150768698E-01
      1.3280                -0.4950670634E+00
      0.5483                 0.1424644907E+00
      0.2305                 0.1043569056E+00
      0.0967                 0.5256328276E-02
P    15   1.00
 250600.0000                -0.1967644112E-02
  37410.0000                -0.5395367896E-02
   8487.0000                -0.1491665312E-01
   2380.0000                -0.2955023628E-01
    765.1000                -0.7029661450E-01
    271.6000                -0.1105444323E+00
    103.3000                -0.2191251939E+00
     41.5900                -0.1627578958E+00
     17.4400                -0.1582827943E+00
      7.5080                 0.7142791784E+00
      3.1880                 0.7243414776E+00
      1.3280                -0.1262316459E+01
      0.5483                 0.9522709530E-01
      0.2305                 0.3342227194E+00
      0.0967                 0.7282159882E-01
P    15   1.00
 250600.0000                -0.1423819687E-08
  37410.0000                -0.2506399594E-08
   8487.0000                -0.1812827889E-07
   2380.0000                 0.1836779300E-07
    765.1000                 0.2614503605E-07
    271.6000                -0.4050636440E-03
    103.3000                -0.1341592829E-02
     41.5900                -0.6292157513E-02
     17.4400                -0.2128853106E-01
      7.5080                -0.6663623448E-01
      3.1880                -0.1750899274E+00
      1.3280                -0.3164372263E+00
      0.5483                -0.3632478745E+00
      0.2305                -0.2490939261E+00
      0.0967                -0.6753667281E-01
P    15   1.00
 250600.0000                 0.2159953732E-02
  37410.0000                 0.7842580917E-02
   8487.0000                 0.1697062014E-01
   2380.0000                 0.5912438990E-01
    765.1000                 0.6845797321E-01
    271.6000                 0.2346000804E+00
    103.3000                 0.8727765337E-01
     41.5900                 0.4171246212E+00
     17.4400                -0.1234360702E+01
      7.5080                -0.2896195978E-01
      3.1880                 0.1873976116E+01
      1.3280                -0.2495947566E+01
      0.5483                 0.1762148187E+01
      0.2305                -0.2919996529E+00
      0.0967                -0.4423571391E+00
P    15   1.00
 250600.0000                -0.7245480820E-08
  37410.0000                -0.3766290315E-08
   8487.0000                 0.2054800364E-07
   2380.0000                 0.3593226159E-08
    765.1000                -0.1678528950E-08
    271.6000                -0.5549666560E-03
    103.3000                -0.2114375796E-02
     41.5900                -0.8674880260E-02
     17.4400                -0.3330049721E-01
      7.5080                -0.9372214987E-01
      3.1880                -0.3388604367E+00
      1.3280                -0.5401513626E+00
      0.5483                 0.2599941771E+00
      0.2305                 0.5871020097E+00
      0.0967                 0.1607957493E+00
P    15   1.00
 250600.0000                -0.1561836256E-02
  37410.0000                -0.5054436032E-02
   8487.0000                -0.1117710692E-01
   2380.0000                -0.3246978106E-01
    765.1000                -0.4448953619E-01
    271.6000                -0.1360055633E+00
    103.3000                -0.8945765225E-01
     41.5900                -0.2733570676E+00
     17.4400                 0.5175995900E+00
      7.5080                 0.3925678611E+00
      3.1880                -0.6319018768E+00
      1.3280                -0.6577756572E+00
      0.5483                 0.1422149822E+01
      0.2305                -0.2667038925E+00
      0.0967                -0.7231998408E+00
D    10   1.00
    271.6000                 0.1389885664E-07
    103.3000                -0.2342695104E-07
     41.5900                 0.5026473264E-07
     17.4400                -0.5424315322E-07
      7.5080                 0.5559830177E-07
      3.1880                -0.2210523332E+00
      1.3280                -0.4651601504E+00
      0.5483                -0.4639757068E+00
      0.2305                 0.3441281044E-07
      0.0967                -0.3053046592E-07
D    10   1.00
    271.6000                 0.1930249940E-01
    103.3000                 0.3778541097E-01
     41.5900                 0.1171355554E+00
     17.4400                 0.2468298539E+00
      7.5080                 0.5241123797E+00
      3.1880                 0.1362976770E+00
      1.3280                -0.4128484596E+00
      0.5483                -0.2796187813E+00
      0.2305                 0.6107284019E+00
      0.0967                 0.1076185376E+00
D    10   1.00
    271.6000                -0.3831361223E-02
    103.3000                -0.1115904991E-01
     41.5900                -0.2364694859E-01
     17.4400                -0.7147988835E-01
      7.5080                -0.1125077351E+00
      3.1880                -0.3954353431E+00
      1.3280                -0.3680884359E+00
      0.5483                 0.4998019525E+00
      0.2305                 0.4613755650E+00
      0.0967                 0.8162684539E-01
F    3   1.00
      3.1880                -0.3700826290E+00
      1.3280                -0.5026484751E+00
      0.5483                -0.3221511935E+00
****
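
A small script can make the structure of a .gbs file like this one easier to see. The sketch below assumes the layout used above: an element line, then shell headers giving type, primitive count and scale factor, then one exponent/coefficient pair per line in E notation, with blocks separated by `****`. `summarize_gbs` is a hypothetical helper written for this post, not part of Psi4, and the sample is just the last shell of each element from the basis above.

```python
def summarize_gbs(text):
    """Return {element: [(shell_type, exponents), ...]} for a .gbs-style basis."""
    summary = {}
    element = None
    lines = [ln.strip() for ln in text.splitlines()]
    # Drop blank lines and "!" comment lines.
    lines = [ln for ln in lines if ln and not ln.startswith("!")]
    i = 0
    while i < len(lines):
        parts = lines[i].split()
        if parts[0] == "****" or parts[0].lower() in ("spherical", "cartesian"):
            element = None  # the next two-token line starts a new element block
            i += 1
        elif element is None and len(parts) == 2:
            element = parts[0]  # e.g. "H     0"
            summary[element] = []
            i += 1
        elif element is not None and len(parts) == 3 and parts[0].isalpha():
            nprim = int(parts[1])  # shell header, e.g. "D    3   1.00"
            prims = lines[i + 1 : i + 1 + nprim]
            exponents = tuple(float(p.split()[0]) for p in prims)
            summary[element].append((parts[0].upper(), exponents))
            i += 1 + nprim  # skip the primitive lines just consumed
        else:
            i += 1
    return summary


sample = """spherical
****
H     0
D    3   1.00
      1.6430                 0.4593061289E+00
      0.5976                 0.5361263152E+00
      0.2213                 0.1682822865E+00
****
O     0
F    3   1.00
      3.1880                -0.3700826290E+00
      1.3280                -0.5026484751E+00
      0.5483                -0.3221511935E+00
****
"""

summary = summarize_gbs(sample)
for element, shells in summary.items():
    n_prim = sum(len(exps) for _, exps in shells)
    distinct = len({exps for _, exps in shells})
    print(f"{element}: {len(shells)} shells, {n_prim} primitives, "
          f"{distinct} distinct exponent set(s)")
```

In the full basis above, every S and P shell on H repeats the same 10 exponents and every S and P shell on O repeats the same 15, so the set is written out as a general contraction, one full-length shell per contracted function. Whether that is what interacts badly with threading is not established here; it is only something to check when comparing against the custom basis set that ran fine.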

Hi, would you be able to provide the full output files using the custom basis set for both the 1-thread and 8-thread runs?

Sure. Here are the output and timing files for the custom basis set with:

1 thread (custom basis set)

   -----------------------------------------------------------------------
          Psi4: An Open-Source Ab Initio Electronic Structure Package
                               Psi4 1.4rc2.dev95

                         Git: Rev {master} 966d1bd dirty


  ==> Input File <==

--------------------------------------------------------------------------
# cytosine base

molecule {
symmetry c1
no_reorient
no_com
0 1
N 23.1210 22.4390 31.7660
C 22.3100 23.0210 32.7620
H 21.5250 22.3720 33.1180
C 22.4250 24.3390 33.0950
H 21.6430 24.5910 33.7960
C 23.4430 25.0660 32.5260
N 23.6420 26.3220 32.8080
H 24.2640 26.8060 32.1770
H 23.0760 26.9140 33.4000
N 24.2110 24.6170 31.5930
C 24.1520 23.2860 31.2160
O 24.9810 22.8250 30.4420
H 23.0534 21.3905 31.5638
}

memory 50 gb

set basis dzhf-dna

set gradient_write on
grad, wfn = gradient('scf', dft_functional='pbe0', return_wfn=True)
wfn.gradient().print_out()
np.array(grad)

--------------------------------------------------------------------------

  Memory set to  46.566 GiB by Python driver.

Scratch directory: /pscratch/alee/psi4-scratch/
gradient() will perform analytic gradient computation.

*** tstart() called on swa40
*** at Tue Apr 26 17:04:36 2022

   => Loading Basis Set <=

    Name: DZHF-DNA
    Role: ORBITAL
    Keyword: BASIS
    atoms 1, 7, 10      entry N          line   307 file /pscratch/alee/psi4-scratch/dzhf-th1/dzhf-dna.gbs
    atoms 2, 4, 6, 11   entry C          line    76 file /pscratch/alee/psi4-scratch/dzhf-th1/dzhf-dna.gbs
    atoms 3, 5, 8-9, 13 entry H          line     4 file /pscratch/alee/psi4-scratch/dzhf-th1/dzhf-dna.gbs
    atoms 12            entry O          line   538 file /pscratch/alee/psi4-scratch/dzhf-th1/dzhf-dna.gbs


         ---------------------------------------------------------
                                   SCF
               by Justin Turney, Rob Parrish, Andy Simmonett
                          and Daniel G. A. Smith
                              RKS Reference
                        1 Threads,  47683 MiB Core
         ---------------------------------------------------------


  Running in c1 symmetry.

  Rotational constants: A =      0.06503  B =      0.00007  C =      0.00007 [cm^-1]
  Rotational constants: A =   1949.51336  B =      2.11016  C =      2.10946 [MHz]
  Nuclear repulsion =  357.685789289489492

  Charge       = 0
  Multiplicity = 1
  Electrons    = 58
  Nalpha       = 29
  Nbeta        = 29

  ==> Algorithm <==

  SCF Algorithm Type is DF.
  DIIS enabled.
  MOM disabled.
  Fractional occupation disabled.
  Guess Type is SAD.
  Energy threshold   = 1.00e-08
  Density threshold  = 1.00e-08
  Integral threshold = 1.00e-12

  ==> Primary Basis <==

  Basis Set: DZHF-DNA
    Blend: DZHF-DNA
    Number of shells: 163
    Number of basis functions: 453
    Number of Cartesian functions: 506
    Spherical Harmonics?: true
    Max angular momentum: 3

  ==> DFT Potential <==

   => LibXC <=

    Version 5.1.5
    S. Lehtola, C. Steigemann, M. J. Oliveira, and M. A. Marques, SoftwareX 7, 1 (2018) (10.1016/j.softx.2017.11.002)

   => Composite Functional: PBE0 <=

    PBE0 Hyb-GGA Exchange-Correlation Functional

    C. Adamo and V. Barone, J. Chem. Phys. 110, 6158 (1999) (10.1063/1.478522)
    M. Ernzerhof and G. E. Scuseria, J. Chem. Phys. 110, 5029 (1999) (10.1063/1.478401)

    Deriv               =              1
    GGA                 =           TRUE
    Meta                =          FALSE

    Exchange Hybrid     =           TRUE
    MP2 Hybrid          =          FALSE

   => Exchange Functionals <=

    0.7500   Perdew, Burke & Ernzerhof

   => Exact (HF) Exchange <=

    0.2500               HF

   => Correlation Functionals <=

    1.0000   Perdew, Burke & Ernzerhof

   => LibXC Density Thresholds  <==

    XC_HYB_GGA_XC_PBEH:  1.00E-15

   => Molecular Quadrature <=

    Radial Scheme          =       TREUTLER
    Pruning Scheme         =           NONE
    Nuclear Scheme         =       TREUTLER

    BS radius alpha        =              1
    Pruning alpha          =              1
    Radial Points          =             75
    Spherical Points       =            302
    Total Points           =         276651
    Total Blocks           =           2060
    Max Points             =            256
    Max Functions          =            453
    Weights Tolerance      =       1.00E-15

   => Loading Basis Set <=

    Name: (DZHF-DNA AUX)
    Role: JKFIT
    Keyword: DF_BASIS_SCF
    atoms 1, 7, 10      entry N          line   258 file /home/alee/codes/psi4_mod/objdir/stage/share/psi4/basis/def2-universal-jkfit.gbs
    atoms 2, 4, 6, 11   entry C          line   198 file /home/alee/codes/psi4_mod/objdir/stage/share/psi4/basis/def2-universal-jkfit.gbs
    atoms 3, 5, 8-9, 13 entry H          line    18 file /home/alee/codes/psi4_mod/objdir/stage/share/psi4/basis/def2-universal-jkfit.gbs
    atoms 12            entry O          line   318 file /home/alee/codes/psi4_mod/objdir/stage/share/psi4/basis/def2-universal-jkfit.gbs

  ==> Integral Setup <==

  DFHelper Memory: AOs need 1.114 GiB; user supplied 31.252 GiB. Using in-core AOs.

  ==> MemDFJK: Density-Fitted J/K Matrices <==

    J tasked:                   Yes
    K tasked:                   Yes
    wK tasked:                   No
    OpenMP threads:               1
    Memory [MiB]:             32002
    Algorithm:                 Core
    Schwarz Cutoff:           1E-12
    Mask sparsity (%):       0.0000
    Fitting Condition:        1E-10

   => Auxiliary Basis Set <=

  Basis Set: (DZHF-DNA AUX)
    Blend: DEF2-UNIVERSAL-JKFIT
    Number of shells: 230
    Number of basis functions: 698
    Number of Cartesian functions: 828
    Spherical Harmonics?: true
    Max angular momentum: 4

  Cached 100.0% of DFT collocation blocks in 3.672 [GiB].

  Minimum eigenvalue in the overlap matrix is 6.1255002001E-05.
  Reciprocal condition number of the overlap matrix is 7.7400220565E-06.
    Using symmetric orthogonalization.

  ==> Pre-Iterations <==

  SCF Guess: Superposition of Atomic Densities via on-the-fly atomic UHF (no occupation information).

   -------------------------
    Irrep   Nso     Nmo
   -------------------------
     A        453     453
   -------------------------
    Total     453     453
   -------------------------

  ==> Iterations <==

                           Total Energy        Delta E     RMS |[F,P]|

   @DF-RKS iter SAD:  -394.37166194401414   -3.94372e+02   0.00000e+00
   @DF-RKS iter   1:  -394.12304543009782    2.48617e-01   2.44905e-03 DIIS
   @DF-RKS iter   2:  -393.76979301384273    3.53252e-01   3.08996e-03 DIIS
   @DF-RKS iter   3:  -394.60588868932410   -8.36096e-01   6.67341e-04 DIIS
   @DF-RKS iter   4:  -394.64407783834639   -3.81891e-02   2.29477e-04 DIIS
   @DF-RKS iter   5:  -394.64827927705790   -4.20144e-03   6.75910e-05 DIIS
   @DF-RKS iter   6:  -394.64862424220996   -3.44965e-04   2.70017e-05 DIIS
   @DF-RKS iter   7:  -394.64868070972648   -5.64675e-05   7.91622e-06 DIIS
   @DF-RKS iter   8:  -394.64868562266821   -4.91294e-06   2.60401e-06 DIIS
   @DF-RKS iter   9:  -394.64868612542415   -5.02756e-07   7.49861e-07 DIIS
   @DF-RKS iter  10:  -394.64868616997938   -4.45552e-08   3.27097e-07 DIIS
   @DF-RKS iter  11:  -394.64868617920655   -9.22716e-09   1.08785e-07 DIIS
   @DF-RKS iter  12:  -394.64868618035570   -1.14915e-09   3.35729e-08 DIIS
   @DF-RKS iter  13:  -394.64868618046160   -1.05899e-10   8.35935e-09 DIIS
  Energy and wave function converged.


  ==> Post-Iterations <==

   Electrons on quadrature grid:
      Ntotal   =   57.9999803838 ; deviation = -1.962e-05



  @DF-RKS Final Energy:  -394.64868618046160

   => Energetics <=

    Nuclear Repulsion Energy =            357.6857892894894917
    One-Electron Energy =               -1245.1083936347306462
    Two-Electron Energy =                 532.6583577940101577
    DFT Exchange-Correlation Energy =     -39.8844396292305916
    Empirical Dispersion Energy =           0.0000000000000000
    VV10 Nonlocal Energy =                  0.0000000000000000
    Total Energy =                       -394.6486861804615387

Computation Completed


Properties will be evaluated at   0.000000,   0.000000,   0.000000 [a0]

Properties computed using the SCF density matrix

  Nuclear Dipole Moment: [e a0]
     X:  2577.9784      Y:  2631.6165      Z:  3511.7797

  Electronic Dipole Moment: [e a0]
     X: -2579.9171      Y: -2631.1144      Z: -3510.1197

  Dipole Moment: [e a0]
     X:    -1.9388      Y:     0.5022      Z:     1.6600     Total:     2.6013

  Dipole Moment: [D]
     X:    -4.9279      Y:     1.2764      Z:     4.2193     Total:     6.6118


*** tstop() called on swa40 at Tue Apr 26 17:16:07 2022
Module time:
        user time   =     684.01 seconds =      11.40 minutes
        system time =       4.06 seconds =       0.07 minutes
        total time  =        691 seconds =      11.52 minutes
Total time:
        user time   =     684.01 seconds =      11.40 minutes
        system time =       4.06 seconds =       0.07 minutes
        total time  =        691 seconds =      11.52 minutes

*** tstart() called on swa40
*** at Tue Apr 26 17:16:07 2022


         ------------------------------------------------------------
                                   SCF GRAD
                          Rob Parrish, Justin Turney,
                       Andy Simmonett, and Alex Sokolov
         ------------------------------------------------------------


  ==> Basis Set <==

  Basis Set: DZHF-DNA
    Blend: DZHF-DNA
    Number of shells: 163
    Number of basis functions: 453
    Number of Cartesian functions: 506
    Spherical Harmonics?: true
    Max angular momentum: 3

  ==> DFJKGrad: Density-Fitted SCF Gradients <==

    Gradient:                    1
    J tasked:                  Yes
    K tasked:                  Yes
    wK tasked:                  No
    OpenMP threads:              1
    Integrals threads:           1
    Memory [MiB]:            35762
    Schwarz Cutoff:          1E-12
    Fitting Condition:       1E-10

   => Auxiliary Basis Set <=

  Basis Set: (DZHF-DNA AUX)
    Blend: DEF2-UNIVERSAL-JKFIT
    Number of shells: 230
    Number of basis functions: 698
    Number of Cartesian functions: 828
    Spherical Harmonics?: true
    Max angular momentum: 4

  ==> DFT Potential <==

   => LibXC <=

    Version 5.1.5
    S. Lehtola, C. Steigemann, M. J. Oliveira, and M. A. Marques, SoftwareX 7, 1 (2018) (10.1016/j.softx.2017.11.002)

   => Composite Functional: PBE0 <=

    PBE0 Hyb-GGA Exchange-Correlation Functional

    C. Adamo and V. Barone, J. Chem. Phys. 110, 6158 (1999) (10.1063/1.478522)
    M. Ernzerhof and G. E. Scuseria, J. Chem. Phys. 110, 5029 (1999) (10.1063/1.478401)

    Deriv               =              1
    GGA                 =           TRUE
    Meta                =          FALSE

    Exchange Hybrid     =           TRUE
    MP2 Hybrid          =          FALSE

   => Exchange Functionals <=

    0.7500   Perdew, Burke & Ernzerhof

   => Exact (HF) Exchange <=

    0.2500               HF

   => Correlation Functionals <=

    1.0000   Perdew, Burke & Ernzerhof

   => LibXC Density Thresholds  <==

    XC_HYB_GGA_XC_PBEH:  1.00E-15

   => Molecular Quadrature <=

    Radial Scheme          =       TREUTLER
    Pruning Scheme         =           NONE
    Nuclear Scheme         =       TREUTLER

    BS radius alpha        =              1
    Pruning alpha          =              1
    Radial Points          =             75
    Spherical Points       =            302
    Total Points           =         276651
    Total Blocks           =           2060
    Max Points             =            256
    Max Functions          =            453
    Weights Tolerance      =       1.00E-15


  -Total Gradient:
     Atom            X                  Y                   Z
    ------   -----------------  -----------------  -----------------
       1        0.013800539850    -0.009722056290    -0.029338900141
       2       -0.023527248914     0.014133967518     0.046188434212
       3        0.001103490971    -0.001106328264    -0.005607191757
       4        0.043299649401     0.065113963850    -0.026260848719
       5       -0.007393014032    -0.019057782565    -0.003472751364
       6        0.000394335315    -0.002121891030     0.019718027770
       7       -0.006865419249    -0.068071265757    -0.016534772430
       8       -0.000528424654     0.002191232130    -0.004175161561
       9        0.000135253923     0.011750370875     0.002400054805
      10       -0.008111034507     0.044414811978     0.017481194488
      11       -0.026150676296     0.009827997390     0.017160275216
      12        0.022742607330     0.001493447953    -0.012671988309
      13       -0.008902289070    -0.048857812321    -0.004888420705


*** tstop() called on swa40 at Tue Apr 26 17:28:46 2022
Module time:
        user time   =     755.92 seconds =      12.60 minutes
        system time =       1.41 seconds =       0.02 minutes
        total time  =        759 seconds =      12.65 minutes
Total time:
        user time   =    1439.94 seconds =      24.00 minutes
        system time =       5.47 seconds =       0.09 minutes
        total time  =       1450 seconds =      24.17 minutes

Wall Time:     1452.30 seconds

                                                       Time (seconds)
Module                                       User      System        Wall        Calls
V: Grid                             :      0.833u      0.033s      0.523w      1 calls
DFH: sparsity prep                  :    496.283u      0.450s    298.778w      1 calls
Libint2ERI::Libint2ERI              :   1268.900u      0.767s    763.645w      7 calls
DFH: initialize()                   :    425.167u      1.450s    256.691w      1 calls
DFH: AO Construction                :    113.783u      0.383s     68.660w      1 calls
DFH: AO-Met. Contraction            :      3.417u      0.767s      2.515w      1 calls
HF: Form core H                     :     15.050u      0.017s      9.115w      1 calls
HF: Form S/X                        :      0.067u      0.000s      0.042w      1 calls
HF: Guess                           :     18.683u      0.517s     11.715w      1 calls
SAD Guess                           :     18.667u      0.517s     11.711w      1 calls
HF: Form G                          :    175.183u      1.433s    106.218w     14 calls
RV: Form V                          :    166.133u      0.567s    100.301w     14 calls
Properties                          :     66.483u      0.383s     40.773w  30900 calls
Functional                          :      4.933u      0.050s      2.938w  30900 calls
V_xc                                :    106.767u      0.183s     63.936w  28840 calls
JK: D                               :      0.033u      0.000s      0.005w     14 calls
JK: USO2AO                          :      0.017u      0.000s      0.005w     14 calls
JK: JK                              :      8.983u      0.850s      5.888w     14 calls
DFH: compute_JK()                   :      8.983u      0.850s      5.888w     14 calls
DFH: Grabbing AOs                   :      0.000u      0.000s      0.000w     14 calls
DFH: compute_J                      :      2.317u      0.000s      1.403w     14 calls
DFH: compute_K                      :      6.550u      0.000s      3.933w     14 calls
JK: AO2USO                          :      0.000u      0.000s      0.000w     14 calls
HF: Form F                          :      0.033u      0.000s      0.008w     14 calls
HF: Form D                          :      0.017u      0.000s      0.004w     14 calls
HF: DIIS                            :      0.317u      0.100s      0.349w     13 calls
DIISManager::add_entry              :      0.000u      0.017s      0.126w     13 calls
DIISManager::extrapolate            :      0.100u      0.083s      0.079w     13 calls
bMatrix setup                       :      0.050u      0.033s      0.044w     13 calls
bMatrix pseudoinverse               :      0.000u      0.000s      0.001w     13 calls
New vector                          :      0.050u      0.050s      0.035w     13 calls
HF: Form C                          :      0.650u      0.000s      0.393w     13 calls
Grad: V T Perturb                   :     80.017u      0.000s     48.130w      1 calls
Grad: S                             :      0.450u      0.000s      0.267w      1 calls
Grad: JK                            :   1156.850u      2.300s    697.603w      1 calls
JKGrad: Amn                         :    453.183u      0.900s    273.142w      1 calls
JKGrad: Awmn                        :      0.000u      0.000s      0.000w      1 calls
JKGrad: AB                          :      0.183u      0.017s      0.122w      1 calls
JKGrad: UV                          :      0.017u      0.000s      0.061w      1 calls
JKGrad: ABx                         :      0.050u      0.017s      0.038w      1 calls
JKGrad: Amnx                        :    703.417u      1.367s    423.904w      1 calls
Grad: XC                            :     22.517u      0.050s     13.670w      1 calls
V_xc gradient                       :     10.050u      0.000s      6.042w   2060 calls

--------------------------------------------------------------------------------------
V: Grid                             :      0.833u      0.033s      0.523w      1 calls
DFH: sparsity prep                  :    496.283u      0.450s    298.778w      1 calls
| Libint2ERI::Libint2ERI            :    309.683u      0.367s    186.514w      1 calls
DFH: initialize()                   :    425.167u      1.450s    256.691w      1 calls
| Libint2ERI::Libint2ERI            :    307.800u      0.217s    185.247w      2 calls
| DFH: AO Construction              :    113.783u      0.383s     68.660w      1 calls
| DFH: AO-Met. Contraction          :      3.417u      0.767s      2.515w      1 calls
HF: Form core H                     :     15.050u      0.017s      9.115w      1 calls
HF: Form S/X                        :      0.067u      0.000s      0.042w      1 calls
HF: Guess                           :     18.683u      0.517s     11.715w      1 calls
| SAD Guess                         :     18.667u      0.517s     11.711w      1 calls
HF: Form G                          :    175.183u      1.433s    106.218w     14 calls
| RV: Form V                        :    166.133u      0.567s    100.301w     14 calls
| | Properties                      :     54.283u      0.333s     33.453w  28840 calls
| | Functional                      :      4.717u      0.050s      2.737w  28840 calls
| | V_xc                            :    106.767u      0.183s     63.936w  28840 calls
| JK: D                             :      0.033u      0.000s      0.005w     14 calls
| JK: USO2AO                        :      0.017u      0.000s      0.005w     14 calls
| JK: JK                            :      8.983u      0.850s      5.888w     14 calls
| | DFH: compute_JK()               :      8.983u      0.850s      5.888w     14 calls
| | | DFH: Grabbing AOs             :      0.000u      0.000s      0.000w     14 calls
| | | DFH: compute_J                :      2.317u      0.000s      1.403w     14 calls
| | | DFH: compute_K                :      6.550u      0.000s      3.933w     14 calls
| JK: AO2USO                        :      0.000u      0.000s      0.000w     14 calls
HF: Form F                          :      0.033u      0.000s      0.008w     14 calls
HF: Form D                          :      0.017u      0.000s      0.004w     14 calls
HF: DIIS                            :      0.317u      0.100s      0.349w     13 calls
| DIISManager::add_entry            :      0.000u      0.017s      0.126w     13 calls
| DIISManager::extrapolate          :      0.100u      0.083s      0.079w     13 calls
| | bMatrix setup                   :      0.050u      0.033s      0.044w     13 calls
| | bMatrix pseudoinverse           :      0.000u      0.000s      0.001w     13 calls
| | New vector                      :      0.050u      0.050s      0.035w     13 calls
HF: Form C                          :      0.650u      0.000s      0.393w     13 calls
Grad: V T Perturb                   :     80.017u      0.000s     48.130w      1 calls
Grad: S                             :      0.450u      0.000s      0.267w      1 calls
Grad: JK                            :   1156.850u      2.300s    697.603w      1 calls
| JKGrad: Amn                       :    453.183u      0.900s    273.142w      1 calls
| | Libint2ERI::Libint2ERI          :    340.817u      0.033s    204.995w      1 calls
| JKGrad: Awmn                      :      0.000u      0.000s      0.000w      1 calls
| JKGrad: AB                        :      0.183u      0.017s      0.122w      1 calls
| | Libint2ERI::Libint2ERI          :      0.000u      0.000s      0.000w      1 calls
| JKGrad: UV                        :      0.017u      0.000s      0.061w      1 calls
| JKGrad: ABx                       :      0.050u      0.017s      0.038w      1 calls
| | Libint2ERI::Libint2ERI          :      0.000u      0.000s      0.000w      1 calls
| JKGrad: Amnx                      :    703.417u      1.367s    423.904w      1 calls
| | Libint2ERI::Libint2ERI          :    310.600u      0.150s    186.889w      1 calls
Grad: XC                            :     22.517u      0.050s     13.670w      1 calls
| Properties                        :     12.200u      0.050s      7.320w   2060 calls
| Functional                        :      0.217u      0.000s      0.202w   2060 calls
| V_xc gradient                     :     10.050u      0.000s      6.042w   2060 calls

**************************************************************************************

For 8 threads (custom basis set):

    -----------------------------------------------------------------------
          Psi4: An Open-Source Ab Initio Electronic Structure Package
                               Psi4 1.4rc2.dev95

                         Git: Rev {master} 966d1bd dirty

  ==> Input File <==

--------------------------------------------------------------------------
# cytosine base

molecule {
symmetry c1
no_reorient
no_com
0 1
N 23.1210 22.4390 31.7660
C 22.3100 23.0210 32.7620
H 21.5250 22.3720 33.1180
C 22.4250 24.3390 33.0950
H 21.6430 24.5910 33.7960
C 23.4430 25.0660 32.5260
N 23.6420 26.3220 32.8080
H 24.2640 26.8060 32.1770
H 23.0760 26.9140 33.4000
N 24.2110 24.6170 31.5930
C 24.1520 23.2860 31.2160
O 24.9810 22.8250 30.4420
H 23.0534 21.3905 31.5638
}

memory 50 gb

set basis dzhf-dna

set gradient_write on
grad, wfn = gradient('scf', dft_functional='pbe0', return_wfn=True)
wfn.gradient().print_out()
np.array(grad)

--------------------------------------------------------------------------

  Memory set to  46.566 GiB by Python driver.

Scratch directory: /pscratch/alee/psi4-scratch/
gradient() will perform analytic gradient computation.

*** tstart() called on swa12
*** at Tue Apr 26 17:20:50 2022

   => Loading Basis Set <=

    Name: DZHF-DNA
    Role: ORBITAL
    Keyword: BASIS
    atoms 1, 7, 10      entry N          line   307 file /pscratch/alee/psi4-scratch/dzhf-th8/dzhf-dna.gbs
    atoms 2, 4, 6, 11   entry C          line    76 file /pscratch/alee/psi4-scratch/dzhf-th8/dzhf-dna.gbs
    atoms 3, 5, 8-9, 13 entry H          line     4 file /pscratch/alee/psi4-scratch/dzhf-th8/dzhf-dna.gbs
    atoms 12            entry O          line   538 file /pscratch/alee/psi4-scratch/dzhf-th8/dzhf-dna.gbs


         ---------------------------------------------------------
                                   SCF
               by Justin Turney, Rob Parrish, Andy Simmonett
                          and Daniel G. A. Smith
                              RKS Reference
                        8 Threads,  47683 MiB Core
         ---------------------------------------------------------


  Charge       = 0
  Multiplicity = 1
  Electrons    = 58
  Nalpha       = 29
  Nbeta        = 29

  ==> Algorithm <==

  SCF Algorithm Type is DF.
  DIIS enabled.
  MOM disabled.
  Fractional occupation disabled.
  Guess Type is SAD.
  Energy threshold   = 1.00e-08
  Density threshold  = 1.00e-08
  Integral threshold = 1.00e-12

  ==> Primary Basis <==

  Basis Set: DZHF-DNA
    Blend: DZHF-DNA
    Number of shells: 163
    Number of basis functions: 453
    Number of Cartesian functions: 506
    Spherical Harmonics?: true
    Max angular momentum: 3

  ==> DFT Potential <==

   => LibXC <=

    Version 5.1.5
    S. Lehtola, C. Steigemann, M. J. Oliveira, and M. A. Marques, SoftwareX 7, 1 (2018) (10.1016/j.softx.2017.11.002)

   => Composite Functional: PBE0 <=

    PBE0 Hyb-GGA Exchange-Correlation Functional

    C. Adamo and V. Barone, J. Chem. Phys. 110, 6158 (1999) (10.1063/1.478522)
    M. Ernzerhof and G. E. Scuseria, J. Chem. Phys. 110, 5029 (1999) (10.1063/1.478401)

    Deriv               =              1
    GGA                 =           TRUE
    Meta                =          FALSE

    Exchange Hybrid     =           TRUE
    MP2 Hybrid          =          FALSE

   => Exchange Functionals <=

    0.7500   Perdew, Burke & Ernzerhof

   => Exact (HF) Exchange <=

    0.2500               HF

   => Correlation Functionals <=

    1.0000   Perdew, Burke & Ernzerhof

   => LibXC Density Thresholds  <==

    XC_HYB_GGA_XC_PBEH:  1.00E-15

   => Molecular Quadrature <=

    Radial Scheme          =       TREUTLER
    Pruning Scheme         =           NONE
    Nuclear Scheme         =       TREUTLER

    BS radius alpha        =              1
    Pruning alpha          =              1
    Radial Points          =             75
    Spherical Points       =            302
    Total Points           =         276651
    Total Blocks           =           2060
    Max Points             =            256
    Max Functions          =            453
    Weights Tolerance      =       1.00E-15

   => Loading Basis Set <=

    Name: (DZHF-DNA AUX)
    Role: JKFIT
    Keyword: DF_BASIS_SCF
    atoms 1, 7, 10      entry N          line   258 file /home/alee/codes/psi4_mod/objdir/stage/share/psi4/basis/def2-universal-jkfit.gbs
    atoms 2, 4, 6, 11   entry C          line   198 file /home/alee/codes/psi4_mod/objdir/stage/share/psi4/basis/def2-universal-jkfit.gbs
    atoms 3, 5, 8-9, 13 entry H          line    18 file /home/alee/codes/psi4_mod/objdir/stage/share/psi4/basis/def2-universal-jkfit.gbs
    atoms 12            entry O          line   318 file /home/alee/codes/psi4_mod/objdir/stage/share/psi4/basis/def2-universal-jkfit.gbs

  ==> Integral Setup <==

  DFHelper Memory: AOs need 1.124 GiB; user supplied 31.252 GiB. Using in-core AOs.

  ==> MemDFJK: Density-Fitted J/K Matrices <==

    J tasked:                   Yes
    K tasked:                   Yes
    wK tasked:                   No
    OpenMP threads:               8
    Memory [MiB]:             32002
    Algorithm:                 Core
    Schwarz Cutoff:           1E-12
    Mask sparsity (%):       0.0000
    Fitting Condition:        1E-10

   => Auxiliary Basis Set <=

  Basis Set: (DZHF-DNA AUX)
    Blend: DEF2-UNIVERSAL-JKFIT
    Number of shells: 230
    Number of basis functions: 698
    Number of Cartesian functions: 828
    Spherical Harmonics?: true
    Max angular momentum: 4

  Cached 100.0% of DFT collocation blocks in 3.672 [GiB].

  Minimum eigenvalue in the overlap matrix is 6.1255002000E-05.
  Reciprocal condition number of the overlap matrix is 7.7400220565E-06.
    Using symmetric orthogonalization.

  ==> Pre-Iterations <==

  SCF Guess: Superposition of Atomic Densities via on-the-fly atomic UHF (no occupation information).

   -------------------------
    Irrep   Nso     Nmo
   -------------------------
     A        453     453
   -------------------------
    Total     453     453
   -------------------------

  ==> Iterations <==

                           Total Energy        Delta E     RMS |[F,P]|

   @DF-RKS iter SAD:  -394.37166194383354   -3.94372e+02   0.00000e+00
   @DF-RKS iter   1:  -394.12304542991888    2.48617e-01   2.44905e-03 DIIS
   @DF-RKS iter   2:  -393.76979301365367    3.53252e-01   3.08996e-03 DIIS
   @DF-RKS iter   3:  -394.60588868912743   -8.36096e-01   6.67341e-04 DIIS
   @DF-RKS iter   4:  -394.64407783814875   -3.81891e-02   2.29477e-04 DIIS
   @DF-RKS iter   5:  -394.64827927686258   -4.20144e-03   6.75910e-05 DIIS
   @DF-RKS iter   6:  -394.64862424201175   -3.44965e-04   2.70017e-05 DIIS
   @DF-RKS iter   7:  -394.64868070952741   -5.64675e-05   7.91622e-06 DIIS
   @DF-RKS iter   8:  -394.64868562246966   -4.91294e-06   2.60401e-06 DIIS
   @DF-RKS iter   9:  -394.64868612522582   -5.02756e-07   7.49861e-07 DIIS
   @DF-RKS iter  10:  -394.64868616978083   -4.45550e-08   3.27097e-07 DIIS
   @DF-RKS iter  11:  -394.64868617900862   -9.22779e-09   1.08785e-07 DIIS
   @DF-RKS iter  12:  -394.64868618015771   -1.14909e-09   3.35729e-08 DIIS
   @DF-RKS iter  13:  -394.64868618026236   -1.04649e-10   8.35935e-09 DIIS
  Energy and wave function converged.


  @DF-RKS Final Energy:  -394.64868618026236

   => Energetics <=

    Nuclear Repulsion Energy =            357.6857892894894917
    One-Electron Energy =               -1245.1083936344500671
    Two-Electron Energy =                 532.6583577938979488
    DFT Exchange-Correlation Energy =     -39.8844396291998322
    Empirical Dispersion Energy =           0.0000000000000000
    VV10 Nonlocal Energy =                  0.0000000000000000
    Total Energy =                       -394.6486861802624162

Computation Completed


Properties will be evaluated at   0.000000,   0.000000,   0.000000 [a0]

Properties computed using the SCF density matrix

  Nuclear Dipole Moment: [e a0]
     X:  2577.9784      Y:  2631.6165      Z:  3511.7797

  Electronic Dipole Moment: [e a0]
     X: -2579.9171      Y: -2631.1144      Z: -3510.1197

  Dipole Moment: [e a0]
     X:    -1.9388      Y:     0.5022      Z:     1.6600     Total:     2.6013

  Dipole Moment: [D]
     X:    -4.9279      Y:     1.2764      Z:     4.2193     Total:     6.6118


*** tstop() called on swa12 at Tue Apr 26 17:29:50 2022
Module time:
        user time   =     809.34 seconds =      13.49 minutes
        system time =      12.01 seconds =       0.20 minutes
        total time  =        540 seconds =       9.00 minutes
Total time:
        user time   =     809.34 seconds =      13.49 minutes
        system time =      12.01 seconds =       0.20 minutes
        total time  =        540 seconds =       9.00 minutes

*** tstart() called on swa12
*** at Tue Apr 26 17:29:50 2022


         ------------------------------------------------------------
                                   SCF GRAD
                          Rob Parrish, Justin Turney,
                       Andy Simmonett, and Alex Sokolov
         ------------------------------------------------------------

  ==> Basis Set <==

  Basis Set: DZHF-DNA
    Blend: DZHF-DNA
    Number of shells: 163
    Number of basis functions: 453
    Number of Cartesian functions: 506
    Spherical Harmonics?: true
    Max angular momentum: 3

  ==> DFJKGrad: Density-Fitted SCF Gradients <==

    Gradient:                    1
    J tasked:                  Yes
    K tasked:                  Yes
    wK tasked:                  No
    OpenMP threads:              8
    Integrals threads:           8
    Memory [MiB]:            35762
    Schwarz Cutoff:          1E-12
    Fitting Condition:       1E-10

   => Auxiliary Basis Set <=

  Basis Set: (DZHF-DNA AUX)
    Blend: DEF2-UNIVERSAL-JKFIT
    Number of shells: 230
    Number of basis functions: 698
    Number of Cartesian functions: 828
    Spherical Harmonics?: true
    Max angular momentum: 4

  ==> DFT Potential <==

   => LibXC <=

    Version 5.1.5
    S. Lehtola, C. Steigemann, M. J. Oliveira, and M. A. Marques, SoftwareX 7, 1 (2018) (10.1016/j.softx.2017.11.002)

   => Composite Functional: PBE0 <=

    PBE0 Hyb-GGA Exchange-Correlation Functional

    C. Adamo and V. Barone, J. Chem. Phys. 110, 6158 (1999) (10.1063/1.478522)
    M. Ernzerhof and G. E. Scuseria, J. Chem. Phys. 110, 5029 (1999) (10.1063/1.478401)

    Deriv               =              1
    GGA                 =           TRUE
    Meta                =          FALSE

    Exchange Hybrid     =           TRUE
    MP2 Hybrid          =          FALSE

   => Exchange Functionals <=

    0.7500   Perdew, Burke & Ernzerhof

   => Exact (HF) Exchange <=

    0.2500               HF

   => Correlation Functionals <=

    1.0000   Perdew, Burke & Ernzerhof

   => LibXC Density Thresholds  <==

    XC_HYB_GGA_XC_PBEH:  1.00E-15

   => Molecular Quadrature <=

    Radial Scheme          =       TREUTLER
    Pruning Scheme         =           NONE
    Nuclear Scheme         =       TREUTLER

    BS radius alpha        =              1
    Pruning alpha          =              1
    Radial Points          =             75
    Spherical Points       =            302
    Total Points           =         276651
    Total Blocks           =           2060
    Max Points             =            256
    Max Functions          =            453
    Weights Tolerance      =       1.00E-15


  -Total Gradient:
     Atom            X                  Y                   Z
    ------   -----------------  -----------------  -----------------
       1        0.013800528124    -0.009722056537    -0.029338900770
       2       -0.023527274607     0.014133935827     0.046188459703
       3        0.001103487343    -0.001106331106    -0.005607192048
       4        0.043299667177     0.065113998998    -0.026260856268
       5       -0.007393031544    -0.019057777875    -0.003472738355
       6        0.000394326830    -0.002121916244     0.019718034578
       7       -0.006865413763    -0.068071278674    -0.016534763856
       8       -0.000528405776     0.002191249609    -0.004175184359
       9        0.000135234446     0.011750393629     0.002400074007
      10       -0.008111003968     0.044414858175     0.017481176892
      11       -0.026150724053     0.009828019836     0.017160320829
      12        0.022742670505     0.001493409432    -0.012672048791
      13       -0.008902290647    -0.048857849608    -0.004888430060


*** tstop() called on swa12 at Tue Apr 26 18:20:33 2022
Module time:
        user time   =    3412.75 seconds =      56.88 minutes
        system time =      14.06 seconds =       0.23 minutes
        total time  =       3043 seconds =      50.72 minutes
Total time:
        user time   =    4222.11 seconds =      70.37 minutes
        system time =      26.07 seconds =       0.43 minutes
        total time  =       3583 seconds =      59.72 minutes
Wall Time:     3584.66 seconds

                                                       Time (seconds)
Module                                       User      System        Wall        Calls
V: Grid                             :      2.917u      0.083s      0.234w      1 calls
DFH: sparsity prep                  :    499.583u      0.467s    300.766w      1 calls
Libint2ERI::Libint2ERI              :   5577.683u     13.983s   3359.746w     42 calls
DFH: initialize()                   :    447.350u      3.783s    196.689w      1 calls
DFH: AO Construction                :    127.033u      0.583s      9.595w      1 calls
DFH: AO-Met. Contraction            :      6.883u      2.000s      0.718w      1 calls
HF: Form core H                     :     26.167u      0.083s      1.981w      1 calls
HF: Form S/X                        :      0.533u      0.017s      0.041w      1 calls
HF: Guess                           :     32.417u      2.817s     12.055w      1 calls
SAD Guess                           :     32.350u      2.817s     12.051w      1 calls
HF: Form G                          :    317.417u      7.633s     24.433w     14 calls
RV: Form V                          :    294.833u      3.900s     22.458w     14 calls
Properties                          :     60.354p                          30900 calls
Functional                          :      3.567p                          30900 calls
V_xc                                :    129.679p                          28840 calls
JK: D                               :      0.050u      0.017s      0.005w     14 calls
JK: USO2AO                          :      0.100u      0.017s      0.009w     14 calls
JK: JK                              :     22.117u      3.683s      1.943w     14 calls
DFH: compute_JK()                   :     22.117u      3.683s      1.943w     14 calls
DFH: Grabbing AOs                   :      0.000u      0.000s      0.000w     14 calls
DFH: compute_J                      :      5.517u      2.600s      0.610w     14 calls
DFH: compute_K                      :      9.383u      0.000s      0.706w     14 calls
JK: AO2USO                          :      0.000u      0.000s      0.000w     14 calls
HF: Form F                          :      0.133u      0.017s      0.009w     14 calls
HF: Form D                          :      0.017u      0.000s      0.005w     14 calls
HF: DIIS                            :      2.983u      0.333s      0.256w     13 calls
DIISManager::add_entry              :      0.883u      0.117s      0.079w     13 calls
DIISManager::extrapolate            :      1.267u      0.183s      0.112w     13 calls
bMatrix setup                       :      0.783u      0.150s      0.068w     13 calls
bMatrix pseudoinverse               :      0.000u      0.000s      0.001w     13 calls
New vector                          :      0.483u      0.033s      0.043w     13 calls
HF: Form C                          :      5.450u      0.100s      0.414w     13 calls
Grad: V T Perturb                   :     93.050u      0.133s      7.006w      1 calls
Grad: S                             :      0.517u      0.000s      0.038w      1 calls
Grad: JK                            :   5552.417u     22.717s   3032.329w      1 calls
JKGrad: Amn                         :   2617.317u     10.317s   1509.183w      1 calls
JKGrad: Awmn                        :      0.000u      0.000s      0.000w      1 calls
JKGrad: AB                          :      1.150u      0.033s      0.096w      1 calls
JKGrad: UV                          :      0.100u      0.017s      0.009w      1 calls
JKGrad: ABx                         :      0.250u      0.000s      0.019w      1 calls
JKGrad: Amnx                        :   2932.983u     12.267s   1522.763w      1 calls
Grad: XC                            :     41.867u      0.583s      3.290w      1 calls
V_xc gradient                       :     10.708p                           2060 calls

--------------------------------------------------------------------------------------
V: Grid                             :      2.917u      0.083s      0.234w      1 calls
DFH: sparsity prep                  :    499.583u      0.467s    300.766w      1 calls
| Libint2ERI::Libint2ERI            :    312.850u      0.400s    188.419w      1 calls
DFH: initialize()                   :    447.350u      3.783s    196.689w      1 calls
| Libint2ERI::Libint2ERI            :    310.500u      0.250s    185.961w      9 calls
| DFH: AO Construction              :    127.033u      0.583s      9.595w      1 calls
| DFH: AO-Met. Contraction          :      6.883u      2.000s      0.718w      1 calls
HF: Form core H                     :     26.167u      0.083s      1.981w      1 calls
HF: Form S/X                        :      0.533u      0.017s      0.041w      1 calls
HF: Guess                           :     32.417u      2.817s     12.055w      1 calls
| SAD Guess                         :     32.350u      2.817s     12.051w      1 calls
HF: Form G                          :    317.417u      7.633s     24.433w     14 calls
| RV: Form V                        :    294.833u      3.900s     22.458w     14 calls
| | Properties                      :     45.955p                          28840 calls
| | Functional                      :      3.327p                          28840 calls
| | V_xc                            :    129.679p                          28840 calls
| JK: D                             :      0.050u      0.017s      0.005w     14 calls
| JK: USO2AO                        :      0.100u      0.017s      0.009w     14 calls
| JK: JK                            :     22.117u      3.683s      1.943w     14 calls
| | DFH: compute_JK()               :     22.117u      3.683s      1.943w     14 calls
| | | DFH: Grabbing AOs             :      0.000u      0.000s      0.000w     14 calls
| | | DFH: compute_J                :      5.517u      2.600s      0.610w     14 calls
| | | DFH: compute_K                :      9.383u      0.000s      0.706w     14 calls
| JK: AO2USO                        :      0.000u      0.000s      0.000w     14 calls
HF: Form F                          :      0.133u      0.017s      0.009w     14 calls
HF: Form D                          :      0.017u      0.000s      0.005w     14 calls
HF: DIIS                            :      2.983u      0.333s      0.256w     13 calls
| DIISManager::add_entry            :      0.883u      0.117s      0.079w     13 calls
| DIISManager::extrapolate          :      1.267u      0.183s      0.112w     13 calls
| | bMatrix setup                   :      0.783u      0.150s      0.068w     13 calls
| | bMatrix pseudoinverse           :      0.000u      0.000s      0.001w     13 calls
| | New vector                      :      0.483u      0.033s      0.043w     13 calls
HF: Form C                          :      5.450u      0.100s      0.414w     13 calls
Grad: V T Perturb                   :     93.050u      0.133s      7.006w      1 calls
Grad: S                             :      0.517u      0.000s      0.038w      1 calls
Grad: JK                            :   5552.417u     22.717s   3032.329w      1 calls
| JKGrad: Amn                       :   2617.317u     10.317s   1509.183w      1 calls
| | Libint2ERI::Libint2ERI          :   2486.067u      7.200s   1498.556w      8 calls
| JKGrad: Awmn                      :      0.000u      0.000s      0.000w      1 calls
| JKGrad: AB                        :      1.150u      0.033s      0.096w      1 calls
| | Libint2ERI::Libint2ERI          :      0.000u      0.000s      0.003w      8 calls
| JKGrad: UV                        :      0.100u      0.017s      0.009w      1 calls
| JKGrad: ABx                       :      0.250u      0.000s      0.019w      1 calls
| | Libint2ERI::Libint2ERI          :      0.033u      0.000s      0.002w      8 calls
| JKGrad: Amnx                      :   2932.983u     12.267s   1522.763w      1 calls
| | Libint2ERI::Libint2ERI          :   2468.233u      6.133s   1486.804w      8 calls
Grad: XC                            :     41.867u      0.583s      3.290w      1 calls
| Properties                        :     14.399p                           2060 calls
| Functional                        :      0.240p                           2060 calls
| V_xc gradient                     :     10.708p                           2060 calls

**************************************************************************************
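
To make the two listings easier to compare, here is a minimal Python sketch that pulls the wall time and call count out of a timer line and prints the wall time per call. The regular expression and the helper name `parse_timer_line` are my own, not part of Psi4. The two sample lines are copied from the nested listings under "JKGrad: Amn"; I am assuming the listing just before the 8-thread one is the single-thread run with the same custom basis.

```python
import re

# Matches timer lines of the form
#   "<name> : <user>u <system>s <wall>w <n> calls"
# with optional "| " nesting markers in front. Lines that report a single
# "...p" value (e.g. "Properties") have no wall column and are skipped.
TIMER_LINE = re.compile(
    r"^[|\s]*(?P<name>.+?)\s*:\s+(?P<user>[\d.]+)u\s+(?P<sys>[\d.]+)s"
    r"\s+(?P<wall>[\d.]+)w\s+(?P<calls>\d+) calls"
)


def parse_timer_line(line):
    """Return (name, wall_seconds, calls), or None if the line does not match."""
    m = TIMER_LINE.match(line)
    if m is None:
        return None
    return m.group("name"), float(m.group("wall")), int(m.group("calls"))


# Copied from the listings above (child of "JKGrad: Amn").
# Assumption: the first line comes from the single-thread run.
one_thread = "| | Libint2ERI::Libint2ERI          :    340.817u      0.033s    204.995w      1 calls"
eight_threads = "| | Libint2ERI::Libint2ERI          :   2486.067u      7.200s   1498.556w      8 calls"

for label, line in (("1 thread", one_thread), ("8 threads", eight_threads)):
    name, wall, calls = parse_timer_line(line)
    print(f"{label}: {name} wall {wall:.1f} s over {calls} calls "
          f"= {wall / calls:.1f} s per call")
```

With these two lines, the per-call wall time comes out similar in both runs (about 205 s for 1 call versus about 187 s for each of 8 calls), while the total grows with the number of calls. That pattern might mean the `Libint2ERI::Libint2ERI` calls are not overlapping in time in the 8-thread run, but the timer output alone does not confirm this.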

And here is the output for 8 threads with the aug-cc-pvtz basis set (no issues):

    -----------------------------------------------------------------------
          Psi4: An Open-Source Ab Initio Electronic Structure Package
                               Psi4 1.4rc2.dev95

                         Git: Rev {master} 966d1bd dirty

  ==> Input File <==

--------------------------------------------------------------------------
# cytosine base

molecule {
symmetry c1
no_reorient
no_com
0 1
N 23.1210 22.4390 31.7660
C 22.3100 23.0210 32.7620
H 21.5250 22.3720 33.1180
C 22.4250 24.3390 33.0950
H 21.6430 24.5910 33.7960
C 23.4430 25.0660 32.5260
N 23.6420 26.3220 32.8080
H 24.2640 26.8060 32.1770
H 23.0760 26.9140 33.4000
N 24.2110 24.6170 31.5930
C 24.1520 23.2860 31.2160
O 24.9810 22.8250 30.4420
H 23.0534 21.3905 31.5638
}

memory 50 gb

set basis aug-cc-pvtz

set gradient_write on
grad, wfn = gradient('scf', dft_functional='pbe0', return_wfn=True)
wfn.gradient().print_out()
np.array(grad)

--------------------------------------------------------------------------

  Memory set to  46.566 GiB by Python driver.

Scratch directory: /pscratch/alee/psi4-scratch/
gradient() will perform analytic gradient computation.

*** tstart() called on swa20
*** at Tue Apr 26 16:55:59 2022

   => Loading Basis Set <=

    Name: AUG-CC-PVTZ
    Role: ORBITAL
    Keyword: BASIS
    atoms 1, 7, 10      entry N          line   285 file /home/alee/codes/psi4_mod/objdir/stage/share/psi4/basis/aug-cc-pvtz.gbs
    atoms 2, 4, 6, 11   entry C          line   239 file /home/alee/codes/psi4_mod/objdir/stage/share/psi4/basis/aug-cc-pvtz.gbs
    atoms 3, 5, 8-9, 13 entry H          line    40 file /home/alee/codes/psi4_mod/objdir/stage/share/psi4/basis/aug-cc-pvtz.gbs
    atoms 12            entry O          line   331 file /home/alee/codes/psi4_mod/objdir/stage/share/psi4/basis/aug-cc-pvtz.gbs


         ---------------------------------------------------------
                                   SCF
               by Justin Turney, Rob Parrish, Andy Simmonett
                          and Daniel G. A. Smith
                              RKS Reference
                        8 Threads,  47683 MiB Core
         ---------------------------------------------------------

  Charge       = 0
  Multiplicity = 1
  Electrons    = 58
  Nalpha       = 29
  Nbeta        = 29

  ==> Algorithm <==

  SCF Algorithm Type is DF.
  DIIS enabled.
  MOM disabled.
  Fractional occupation disabled.
  Guess Type is SAD.
  Energy threshold   = 1.00e-08
  Density threshold  = 1.00e-08
  Integral threshold = 1.00e-12

  ==> Primary Basis <==

  Basis Set: AUG-CC-PVTZ
    Blend: AUG-CC-PVTZ
    Number of shells: 157
    Number of basis functions: 483
    Number of Cartesian functions: 565
    Spherical Harmonics?: true
    Max angular momentum: 3

  ==> DFT Potential <==

   => LibXC <=

    Version 5.1.5
    S. Lehtola, C. Steigemann, M. J. Oliveira, and M. A. Marques, SoftwareX 7, 1 (2018) (10.1016/j.softx.2017.11.002)

   => Composite Functional: PBE0 <=

    PBE0 Hyb-GGA Exchange-Correlation Functional

    C. Adamo and V. Barone, J. Chem. Phys. 110, 6158 (1999) (10.1063/1.478522)
    M. Ernzerhof and G. E. Scuseria, J. Chem. Phys. 110, 5029 (1999) (10.1063/1.478401)

    Deriv               =              1
    GGA                 =           TRUE
    Meta                =          FALSE

    Exchange Hybrid     =           TRUE
    MP2 Hybrid          =          FALSE

   => Exchange Functionals <=

    0.7500   Perdew, Burke & Ernzerhof

   => Exact (HF) Exchange <=

    0.2500               HF

   => Correlation Functionals <=

    1.0000   Perdew, Burke & Ernzerhof

   => LibXC Density Thresholds  <==

    XC_HYB_GGA_XC_PBEH:  1.00E-15

   => Molecular Quadrature <=

    Radial Scheme          =       TREUTLER
    Pruning Scheme         =           NONE
    Nuclear Scheme         =       TREUTLER

    BS radius alpha        =              1
    Pruning alpha          =              1
    Radial Points          =             75
    Spherical Points       =            302
    Total Points           =         277283
    Total Blocks           =           2066
    Max Points             =            256
    Max Functions          =            472
    Weights Tolerance      =       1.00E-15

   => Loading Basis Set <=

    Name: (AUG-CC-PVTZ AUX)
    Role: JKFIT
    Keyword: DF_BASIS_SCF
    atoms 1, 7, 10      entry N          line   224 file /home/alee/codes/psi4_mod/objdir/stage/share/psi4/basis/aug-cc-pvtz-jkfit.gbs
    atoms 2, 4, 6, 11   entry C          line   162 file /home/alee/codes/psi4_mod/objdir/stage/share/psi4/basis/aug-cc-pvtz-jkfit.gbs
    atoms 3, 5, 8-9, 13 entry H          line    70 file /home/alee/codes/psi4_mod/objdir/stage/share/psi4/basis/aug-cc-pvtz-jkfit.gbs
    atoms 12            entry O          line   286 file /home/alee/codes/psi4_mod/objdir/stage/share/psi4/basis/aug-cc-pvtz-jkfit.gbs

  ==> Integral Setup <==

  DFHelper Memory: AOs need 1.862 GiB; user supplied 31.854 GiB. Using in-core AOs.

  ==> MemDFJK: Density-Fitted J/K Matrices <==

    J tasked:                   Yes
    K tasked:                   Yes
    wK tasked:                   No
    OpenMP threads:               8
    Memory [MiB]:             32618
    Algorithm:                 Core
    Schwarz Cutoff:           1E-12
    Mask sparsity (%):       2.8617
    Fitting Condition:        1E-10

   => Auxiliary Basis Set <=

  Basis Set: (AUG-CC-PVTZ AUX)
    Blend: AUG-CC-PVTZ-JKFIT
    Number of shells: 310
    Number of basis functions: 1062
    Number of Cartesian functions: 1323
    Spherical Harmonics?: true
    Max angular momentum: 4

  Cached 100.0% of DFT collocation blocks in 3.071 [GiB].

  Minimum eigenvalue in the overlap matrix is 3.2169881909E-06.
  Reciprocal condition number of the overlap matrix is 1.9951619896E-07.
    Using symmetric orthogonalization.

  ==> Pre-Iterations <==

  SCF Guess: Superposition of Atomic Densities via on-the-fly atomic UHF (no occupation information).

   -------------------------
    Irrep   Nso     Nmo
   -------------------------
     A        483     483
   -------------------------
    Total     483     483
   -------------------------

  ==> Iterations <==

                           Total Energy        Delta E     RMS |[F,P]|

   @DF-RKS iter SAD:  -394.26169446051421   -3.94262e+02   0.00000e+00
   @DF-RKS iter   1:  -394.06168853243605    2.00006e-01   2.35318e-03 DIIS
   @DF-RKS iter   2:  -393.60436943038036    4.57319e-01   3.05091e-03 DIIS
   @DF-RKS iter   3:  -394.57382453452652   -9.69455e-01   6.28109e-04 DIIS
   @DF-RKS iter   4:  -394.61107147033198   -3.72469e-02   2.66234e-04 DIIS
   @DF-RKS iter   5:  -394.61789851734943   -6.82705e-03   6.53074e-05 DIIS
   @DF-RKS iter   6:  -394.61826970540233   -3.71188e-04   2.74528e-05 DIIS
   @DF-RKS iter   7:  -394.61833660166815   -6.68963e-05   8.90611e-06 DIIS
   @DF-RKS iter   8:  -394.61834354219104   -6.94052e-06   3.02056e-06 DIIS
   @DF-RKS iter   9:  -394.61834432913048   -7.86939e-07   8.44151e-07 DIIS
   @DF-RKS iter  10:  -394.61834439312383   -6.39934e-08   3.38353e-07 DIIS
   @DF-RKS iter  11:  -394.61834440373599   -1.06122e-08   1.36312e-07 DIIS
   @DF-RKS iter  12:  -394.61834440588660   -2.15061e-09   3.42201e-08 DIIS
   @DF-RKS iter  13:  -394.61834440601700   -1.30399e-10   1.03477e-08 DIIS
   @DF-RKS iter  14:  -394.61834440602786   -1.08571e-11   3.58439e-09 DIIS
  Energy and wave function converged.


  @DF-RKS Final Energy:  -394.61834440602786

   => Energetics <=

    Nuclear Repulsion Energy =            357.6857892894894917
    One-Electron Energy =               -1244.9168228111495864
    Two-Electron Energy =                 532.4733879917082504
    DFT Exchange-Correlation Energy =     -39.8606988760761212
    Empirical Dispersion Energy =           0.0000000000000000
    VV10 Nonlocal Energy =                  0.0000000000000000
    Total Energy =                       -394.6183444060279157

Computation Completed


Properties will be evaluated at   0.000000,   0.000000,   0.000000 [a0]

Properties computed using the SCF density matrix

  Nuclear Dipole Moment: [e a0]
     X:  2577.9784      Y:  2631.6165      Z:  3511.7797

  Electronic Dipole Moment: [e a0]
     X: -2579.9422      Y: -2631.1101      Z: -3510.0987

  Dipole Moment: [e a0]
     X:    -1.9638      Y:     0.5065      Z:     1.6810     Total:     2.6342

  Dipole Moment: [D]
     X:    -4.9915      Y:     1.2874      Z:     4.2728     Total:     6.6954


*** tstop() called on swa20 at Tue Apr 26 16:56:27 2022
Module time:
        user time   =     191.44 seconds =       3.19 minutes
        system time =      17.54 seconds =       0.29 minutes
        total time  =         28 seconds =       0.47 minutes
Total time:
        user time   =     191.44 seconds =       3.19 minutes
        system time =      17.54 seconds =       0.29 minutes
        total time  =         28 seconds =       0.47 minutes

*** tstart() called on swa20
*** at Tue Apr 26 16:56:27 2022


         ------------------------------------------------------------
                                   SCF GRAD
                          Rob Parrish, Justin Turney,
                       Andy Simmonett, and Alex Sokolov
         ------------------------------------------------------------

  ==> Basis Set <==

  Basis Set: AUG-CC-PVTZ
    Blend: AUG-CC-PVTZ
    Number of shells: 157
    Number of basis functions: 483
    Number of Cartesian functions: 565
    Spherical Harmonics?: true
    Max angular momentum: 3

  ==> DFJKGrad: Density-Fitted SCF Gradients <==

    Gradient:                    1
    J tasked:                  Yes
    K tasked:                  Yes
    wK tasked:                  No
    OpenMP threads:              8
    Integrals threads:           8
    Memory [MiB]:            35762
    Schwarz Cutoff:          1E-12
    Fitting Condition:       1E-10

   => Auxiliary Basis Set <=

  Basis Set: (AUG-CC-PVTZ AUX)
    Blend: AUG-CC-PVTZ-JKFIT
    Number of shells: 310
    Number of basis functions: 1062
    Number of Cartesian functions: 1323
    Spherical Harmonics?: true
    Max angular momentum: 4

  ==> DFT Potential <==

   => LibXC <=

    Version 5.1.5
    S. Lehtola, C. Steigemann, M. J. Oliveira, and M. A. Marques, SoftwareX 7, 1 (2018) (10.1016/j.softx.2017.11.002)

   => Composite Functional: PBE0 <=

    PBE0 Hyb-GGA Exchange-Correlation Functional

    C. Adamo and V. Barone, J. Chem. Phys. 110, 6158 (1999) (10.1063/1.478522)
    M. Ernzerhof and G. E. Scuseria, J. Chem. Phys. 110, 5029 (1999) (10.1063/1.478401)

    Deriv               =              1
    GGA                 =           TRUE
    Meta                =          FALSE

    Exchange Hybrid     =           TRUE
    MP2 Hybrid          =          FALSE

   => Exchange Functionals <=

    0.7500   Perdew, Burke & Ernzerhof

   => Exact (HF) Exchange <=

    0.2500               HF

   => Correlation Functionals <=

    1.0000   Perdew, Burke & Ernzerhof

   => LibXC Density Thresholds  <==

    XC_HYB_GGA_XC_PBEH:  1.00E-15

   => Molecular Quadrature <=

    Radial Scheme          =       TREUTLER
    Pruning Scheme         =           NONE
    Nuclear Scheme         =       TREUTLER

    BS radius alpha        =              1
    Pruning alpha          =              1
    Radial Points          =             75
    Spherical Points       =            302
    Total Points           =         277283
    Total Blocks           =           2066
    Max Points             =            256
    Max Functions          =            472
    Weights Tolerance      =       1.00E-15


  -Total Gradient:
     Atom            X                  Y                   Z
    ------   -----------------  -----------------  -----------------
       1        0.013609395442    -0.009632570740    -0.029121409643
       2       -0.023365455876     0.014081711049     0.046032649988
       3        0.001203305650    -0.001021331766    -0.005663273070
       4        0.043212291116     0.064989426727    -0.026237931758
       5       -0.007270478993    -0.019129731132    -0.003594559190
       6        0.001195380910    -0.001879082606     0.019053027326
       7       -0.006990542190    -0.068237086983    -0.016624145434
       8       -0.000874356590     0.001888953201    -0.003861666472
       9        0.000381007609     0.011526901823     0.002135912776
      10       -0.008933112926     0.044327291443     0.018208680249
      11       -0.023770623177     0.008791743140     0.014981713396
      12        0.020460796361     0.002643871524    -0.010522651779
      13       -0.008856770340    -0.048362008978    -0.004785903728


*** tstop() called on swa20 at Tue Apr 26 16:56:40 2022
Module time:
        user time   =      67.71 seconds =       1.13 minutes
        system time =       3.68 seconds =       0.06 minutes
        total time  =         13 seconds =       0.22 minutes
Total time:
        user time   =     259.17 seconds =       4.32 minutes
        system time =      21.22 seconds =       0.35 minutes
        total time  =         41 seconds =       0.68 minutes

Wall Time:       42.73 seconds

                                                       Time (seconds)
Module                                       User      System        Wall        Calls
V: Grid                             :      2.483u      0.083s      0.201w      1 calls
DFH: sparsity prep                  :      0.517u      0.033s      0.374w      1 calls
Libint2ERI::Libint2ERI              :     10.683u      0.233s      3.449w     42 calls
DFH: initialize()                   :     28.450u      4.350s      2.828w      1 calls
DFH: AO Construction                :      9.417u      0.867s      0.773w      1 calls
DFH: AO-Met. Contraction            :     13.250u      2.917s      1.375w      1 calls
HF: Form core H                     :      1.633u      0.033s      0.135w      1 calls
HF: Form S/X                        :      0.633u      0.017s      0.049w      1 calls
HF: Guess                           :      4.300u      0.217s      0.360w      1 calls
SAD Guess                           :      4.250u      0.217s      0.356w      1 calls
HF: Form G                          :    259.117u     19.367s     20.953w     15 calls
RV: Form V                          :    213.883u      9.483s     16.788w     15 calls
Properties                          :     53.632p                          33056 calls
Functional                          :      3.826p                          33056 calls
V_xc                                :     88.330p                          30990 calls
JK: D                               :      0.033u      0.000s      0.006w     15 calls
JK: USO2AO                          :      0.183u      0.017s      0.009w     15 calls
JK: JK                              :     44.717u      9.850s      4.124w     15 calls
DFH: compute_JK()                   :     44.717u      9.850s      4.124w     15 calls
DFH: Grabbing AOs                   :      0.000u      0.000s      0.000w     15 calls
DFH: compute_J                      :     12.200u      7.733s      1.514w     15 calls
DFH: compute_K                      :     20.350u      0.350s      1.556w     15 calls
JK: AO2USO                          :      0.000u      0.000s      0.000w     15 calls
HF: Form F                          :      0.183u      0.033s      0.012w     15 calls
HF: Form D                          :      0.067u      0.000s      0.006w     15 calls
HF: DIIS                            :      3.983u      0.450s      0.345w     14 calls
DIISManager::add_entry              :      1.383u      0.167s      0.120w     14 calls
DIISManager::extrapolate            :      1.617u      0.167s      0.138w     14 calls
bMatrix setup                       :      0.983u      0.117s      0.084w     14 calls
bMatrix pseudoinverse               :      0.050u      0.000s      0.001w     14 calls
New vector                          :      0.567u      0.050s      0.052w     14 calls
HF: Form C                          :      7.317u      0.117s      0.555w     14 calls
Grad: V T Perturb                   :      6.267u      0.183s      0.505w      1 calls
Grad: S                             :      0.067u      0.000s      0.005w      1 calls
Grad: JK                            :     75.383u      5.817s     10.136w      1 calls
JKGrad: Amn                         :     22.983u      1.717s      3.893w      1 calls
JKGrad: Awmn                        :      0.000u      0.000s      0.000w      1 calls
JKGrad: AB                          :      2.350u      0.067s      0.182w      1 calls
JKGrad: UV                          :      0.267u      0.017s      0.022w      1 calls
JKGrad: ABx                         :      0.367u      0.017s      0.029w      1 calls
JKGrad: Amnx                        :     48.550u      3.950s      5.935w      1 calls
Grad: XC                            :     31.050u      0.133s      2.379w      1 calls
V_xc gradient                       :      6.143p                           2066 calls

--------------------------------------------------------------------------------------
V: Grid                             :      2.483u      0.083s      0.201w      1 calls
DFH: sparsity prep                  :      0.517u      0.033s      0.374w      1 calls
| Libint2ERI::Libint2ERI            :      0.333u      0.033s      0.261w      1 calls
DFH: initialize()                   :     28.450u      4.350s      2.828w      1 calls
| Libint2ERI::Libint2ERI            :      1.783u      0.067s      0.190w      9 calls
| DFH: AO Construction              :      9.417u      0.867s      0.773w      1 calls
| DFH: AO-Met. Contraction          :     13.250u      2.917s      1.375w      1 calls
HF: Form core H                     :      1.633u      0.033s      0.135w      1 calls
HF: Form S/X                        :      0.633u      0.017s      0.049w      1 calls
HF: Guess                           :      4.300u      0.217s      0.360w      1 calls
| SAD Guess                         :      4.250u      0.217s      0.356w      1 calls
HF: Form G                          :    259.117u     19.367s     20.953w     15 calls
| RV: Form V                        :    213.883u      9.483s     16.788w     15 calls
| | Properties                      :     41.606p                          30990 calls
| | Functional                      :      3.586p                          30990 calls
| | V_xc                            :     88.330p                          30990 calls
| JK: D                             :      0.033u      0.000s      0.006w     15 calls
| JK: USO2AO                        :      0.183u      0.017s      0.009w     15 calls
| JK: JK                            :     44.717u      9.850s      4.124w     15 calls
| | DFH: compute_JK()               :     44.717u      9.850s      4.124w     15 calls
| | | DFH: Grabbing AOs             :      0.000u      0.000s      0.000w     15 calls
| | | DFH: compute_J                :     12.200u      7.733s      1.514w     15 calls
| | | DFH: compute_K                :     20.350u      0.350s      1.556w     15 calls
| JK: AO2USO                        :      0.000u      0.000s      0.000w     15 calls
HF: Form F                          :      0.183u      0.033s      0.012w     15 calls
HF: Form D                          :      0.067u      0.000s      0.006w     15 calls
HF: DIIS                            :      3.983u      0.450s      0.345w     14 calls
| DIISManager::add_entry            :      1.383u      0.167s      0.120w     14 calls
| DIISManager::extrapolate          :      1.617u      0.167s      0.138w     14 calls
| | bMatrix setup                   :      0.983u      0.117s      0.084w     14 calls
| | bMatrix pseudoinverse           :      0.050u      0.000s      0.001w     14 calls
| | New vector                      :      0.567u      0.050s      0.052w     14 calls
HF: Form C                          :      7.317u      0.117s      0.555w     14 calls
Grad: V T Perturb                   :      6.267u      0.183s      0.505w      1 calls
Grad: S                             :      0.067u      0.000s      0.005w      1 calls
Grad: JK                            :     75.383u      5.817s     10.136w      1 calls
| JKGrad: Amn                       :     22.983u      1.717s      3.893w      1 calls
| | Libint2ERI::Libint2ERI          :      3.967u      0.050s      1.499w      8 calls
| JKGrad: Awmn                      :      0.000u      0.000s      0.000w      1 calls
| JKGrad: AB                        :      2.350u      0.067s      0.182w      1 calls
| | Libint2ERI::Libint2ERI          :      0.017u      0.000s      0.001w      8 calls
| JKGrad: UV                        :      0.267u      0.017s      0.022w      1 calls
| JKGrad: ABx                       :      0.367u      0.017s      0.029w      1 calls
| | Libint2ERI::Libint2ERI          :      0.000u      0.000s      0.001w      8 calls
| JKGrad: Amnx                      :     48.550u      3.950s      5.935w      1 calls
| | Libint2ERI::Libint2ERI          :      4.583u      0.083s      1.497w      8 calls
Grad: XC                            :     31.050u      0.133s      2.379w      1 calls
| Properties                        :     12.026p                           2066 calls
| Functional                        :      0.240p                           2066 calls
| V_xc gradient                     :      6.143p                           2066 calls

**************************************************************************************

@ajemyunglee Unrelated to the threading issue: note that you are using a generally contracted basis set, which is going to be extremely inefficient in Psi4. A program such as PySCF, which natively supports general contractions, will be orders of magnitude faster.

Interesting. I tried manually decontracting my basis set, i.e.

H     0
S    10   1.00
   5485.0000                 0.7866305532E-05
    803.5000                 0.1005295712E-03
    175.7000                 0.4292187586E-03
     49.6700                 0.2200643087E-02
     14.5300                 0.8841925973E-02
      4.7790                 0.3294507491E-01
      1.6430                 0.1062259599E+00
      0.5976                 0.2801966826E+00
      0.2213                 0.4671465639E+00
      0.0834                 0.2454942559E+00

turns into

H     0
S    1   1.00
   5485.0000                 0.7866305532E-05
S    1   1.00
    803.5000                 0.1005295712E-03
S    1   1.00
    175.7000                 0.4292187586E-03
S    1   1.00
     49.6700                 0.2200643087E-02
S    1   1.00
     14.5300                 0.8841925973E-02
S    1   1.00
      4.7790                 0.3294507491E-01
S    1   1.00
      1.6430                 0.1062259599E+00
S    1   1.00
      0.5976                 0.2801966826E+00
S    1   1.00
      0.2213                 0.4671465639E+00
S    1   1.00
      0.0834                 0.2454942559E+00

and the threading problem seemed to go away. So perhaps it’s not unrelated to the threading!
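
For anyone who wants to script the manual decontraction shown above, here is a minimal sketch in plain Python. The function name `decontract_gbs` is hypothetical and not part of Psi4. It assumes a Gaussian94-style file with two-column primitive lines (no SP shells). Each primitive keeps its original coefficient, as in the hand-edited example. As an extra step that is not in the post above, an exponent that recurs within the same angular momentum of one element is emitted only once, which is my assumption about how to avoid duplicate functions when several contractions share primitives.

```python
def decontract_gbs(text):
    """Split every contracted shell into single-primitive shells.

    Assumes Gaussian94-style input: a shell header such as 'S   10   1.00'
    followed by that many 'exponent coefficient' lines.
    """
    lines = text.splitlines()
    out = []
    seen = set()  # (angular momentum, exponent) pairs already written for this element
    i = 0
    while i < len(lines):
        tokens = lines[i].split()
        # A shell header has three tokens: label, primitive count, scale factor.
        if len(tokens) == 3 and tokens[0].isalpha() and tokens[1].isdigit():
            am, nprim, scale = tokens[0].upper(), int(tokens[1]), tokens[2]
            for prim in lines[i + 1 : i + 1 + nprim]:
                fields = prim.split()
                if len(fields) != 2:
                    raise ValueError(f"unsupported primitive line: {prim!r}")
                # Fortran-style 'D' exponents are accepted when comparing values.
                key = (am, float(fields[0].upper().replace("D", "E")))
                if key in seen:
                    continue  # same exponent already emitted for this element
                seen.add(key)
                out.append(f"{am}    1   {scale}")
                out.append(prim)
            i += 1 + nprim
        else:
            if tokens == ["****"]:
                seen.clear()  # element separator: start a fresh exponent list
            out.append(lines[i])
            i += 1
    return "\n".join(out) + "\n"
```

Usage would be along the lines of reading the original *.gbs file into a string, passing it to `decontract_gbs`, and writing the result to a new file that the input then points to. Check the shell and basis function counts printed in the output afterwards.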

Thank you. More generally, should we try to avoid using generally contracted basis sets with Psi4?

This is fixed in 1.6. Look out for the release later this month or, if all goes well, later this week.

Yes, segmented basis sets will run much faster.