SAPT2 calculation stops with a Segmentation fault error


#1

I am trying to run a SAPT2 calculation using psi4, which was installed via conda, on 40 cores with 100 GB of RAM (Intel Xeon processors).

The job always stops at the SAPT2 step with the error:

forrtl:severe (174): SIGSEGV

Any help on how to rectify the error and move forward will be greatly appreciated.


#2

We can’t offer much help without you posting an input file.

Because the forum will otherwise interfere with your formatting, please enclose the file in triple backticks. It would also make things easier on us if you gave us as small a file as possible that still produces the segfault.


#3

Here is the input file:

memory 100GiB




molecule AP_BMIMBF4 
{
      0 1
C          2.71424        3.35160        2.17047
C          2.61539        3.07721        0.69258
C          3.27155        1.81549        0.13150
C          3.01216        0.53303        0.90532
C          3.45716       -0.78046        0.26355
C          3.05053       -0.99330       -1.18650
C          3.20741       -2.40360       -1.74586
C          2.55026       -3.52443       -0.96032
F          3.17228        4.13514       -0.00592
F          1.27554        3.06932        0.32664
F          3.66787        0.59838        2.12267
F          1.67182        0.43073        1.22024
F          3.82500       -0.17529       -1.99644
F          1.75160       -0.57213       -1.38634
F          3.25533       -3.78096        0.19399
F          1.27730       -3.18683       -0.58308
H          2.25306        4.32050        2.37079
H          3.76241        3.37061        2.47326
H          2.19469        2.57073        2.72526
H          4.35180        1.97066        0.10411
H          2.92157        1.70418       -0.89385
H          4.54579       -0.84429        0.31346
H          3.04476       -1.57689        0.88214
H          4.27328       -2.63006       -1.82364
H          2.78870       -2.38708       -2.75512
H          2.50060       -4.45608       -1.52780
--
       0 1
C         -3.87349        2.91061       -1.65876
N         -2.75856        1.99410       -1.41455
C         -1.50263        2.33407       -0.94691
C         -0.83120        1.17172       -0.73344
C         -2.84151        0.66484       -1.47790
N         -1.68346        0.14262       -1.07953
C         -1.46679       -1.28577       -0.78728
C         -1.38542       -1.51827        0.72247
C         -1.59684       -2.99357        1.06458
C         -1.46598       -3.24276        2.56923
H         -3.87594        3.66326       -0.86989
H         -3.76218        3.38868       -2.63477
H         -4.79884        2.33892       -1.60023
H         -1.20045        3.35521       -0.79404
H          0.15181        0.98994       -0.34830
H         -3.72834        0.10789       -1.72147
H         -0.55449       -1.60196       -1.29485
H         -2.32462       -1.81798       -1.19641
H         -0.41387       -1.17340        1.09480
H         -2.16763       -0.92438        1.19970
H         -0.87575       -3.61428        0.52075
H         -2.59984       -3.27619        0.72873
H         -0.46552       -2.96983        2.92459
H         -2.19613       -2.64103        3.11964
H         -1.63860       -4.29508        2.81528
B         -4.99249       -0.20013        0.53015
F         -4.36607       -1.36290       -0.02014
F         -5.60912        0.50296       -0.55477
F         -3.96502        0.63394        1.05703
F         -5.91416       -0.53978        1.49939
     units angstrom
     no_reorient
     symmetry c1 
}



set{ basis 6-311+G(d,p)
     scf_type df
          
     DF_BASIS_SAPT 6-311+G(d,p)
     SAPT_DFT_FUNCTIONAL B3LYP-D

}
energy('sapt2', molecule=AP_BMIMBF4)

This input consistently produces the segfault. I do not have a smaller input at the moment; I will update if the fault reproduces with another input.


#4

When I tried running this computation in serial, I got the error "kill: sending signal to 26528 failed: No such process" and the following traceback:

File "/opt/vulcan/opt/vulcan/linux-x86_64/intel-16.0.1/psi4-master-avducsgnonqrcpftiesq432gagal6y7r/lib/psi4/driver/driver.py", line 492, in energy
    wfn = procedures['energy'][lowername](lowername, molecule=molecule, **kwargs)
  File "/opt/vulcan/opt/vulcan/linux-x86_64/intel-16.0.1/psi4-master-avducsgnonqrcpftiesq432gagal6y7r/lib/psi4/driver/procrouting/proc.py", line 3450, in run_sapt
    e_sapt = core.sapt(dimer_wfn, monomerA_wfn, monomerB_wfn)

RuntimeError: 
Fatal Error: Not enough memory
Error occurred in file: /home/vulcan/vadmin/programs/psi4/psi4/src/psi4/libsapt_solver/sapt2.cc on line: 217

You would think that since the job requests 64566.6 MB, 100 GB would be plenty!

You will need more memory to even attempt this computation. I’ll file a bug report about how off the memory request is.


#5

While you’ve certainly identified some issues we should look into, here is some general advice:

  • That’s a really big system for higher-order SAPT. Have you done a preparatory SAPT0 calculation?
  • The idea of DF basis sets is to use one that’s bigger, usually much bigger, than the orbital basis. By setting df_basis_sapt to the same basis as the orbital one, you’re defeating that, and I wouldn’t be surprised if that caused dimension problems somewhere.
  • SAPT_DFT_FUNCTIONAL is relevant to SAPT(DFT) and isn’t doing anything good here.

#6

@jmisiewicz

I ran the job with 125 GB of memory this time (the maximum possible on the workstation I have). It still exits with the segmentation fault. I should point out that the system being studied is a combination of the ionic liquid BMIMBF4 with alpha-PVDF.

However, the SAPT0 and SAPT2 calculations finish successfully for the ionic liquid BMIMBF4 when run separately.

@loriab

Yes, the SAPT0 calculation for this system ran smoothly.

I re-ran the job without sapt_dft_functional and hit the same error.

Any pointers on how to address your second concern would be helpful.

So should I (kind of) conclude that the memory requirement is what causes the problem?

Update: Using the aug-cc-pvdz basis set still reproduces the fault.


#7

I have had a similar problem with segmentation faults during SAPT2+ jobs on large systems (268 electrons with aug-cc-pvdz basis). I think I identified and patched an integer overflow in the part of the code that computes the SAPT amplitudes, similar to the fixes described in this thread from last year. I can confirm that this patch fixes the segmentation fault for my job, but I haven’t tested it extensively. @loriab, would you be interested in having me try to push this to the psi4 master?
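To illustrate the kind of overflow meant here, below is a minimal sketch in Python. It is not the Psi4 code and not the patch itself. The orbital counts are made-up values chosen only so that the product exceeds the 32-bit limit, and `as_int32` is a hypothetical helper that mimics how a signed 32-bit C integer would store a value.

```python
import ctypes

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit C int can hold


def as_int32(value: int) -> int:
    """Return `value` as a signed 32-bit C int would store it (wraps on overflow)."""
    return ctypes.c_int32(value).value


# Hypothetical dimensions chosen for illustration; not taken from any real job.
n_occ = 67   # occupied orbitals on one monomer (assumed)
n_vir = 800  # virtual orbitals on one monomer (assumed)

n_pairs = n_occ * n_vir     # occupied-virtual pairs on one monomer
n_elements = n_pairs ** 2   # elements in a pair-by-pair amplitude array

wrapped = as_int32(n_elements)
print(f"true element count: {n_elements}")
print(f"as 32-bit int:      {wrapped}")
print(f"exceeds int32 max:  {n_elements > INT32_MAX}")
```

If a wrapped (here negative) count like this ends up being used as an allocation size or array offset, the program can read or write outside its buffers, which is one plausible route to a segmentation fault. Holding such counts in a 64-bit integer type avoids the wrap.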


#8

@anakin

Thanks for the additional info – it sounds like you’ve done the prudent preparatory calcs. For the aux basis, if you don’t like the results of the defaulting mechanism, set DF_BASIS_SAPT to one of the def2 fitting basis sets, as they have good periodic table coverage.
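As a rough sketch of what that could look like through the psi4 Python module: the specific fitting set name `def2-qzvpp-ri` is an assumption (check that it exists in your installation's basis library), `SAPT_OPTIONS` and `configure_psi4` are hypothetical names used only for this example, and the orbital basis is simply the aug-cc-pvdz one tried earlier in the thread.

```python
# Sketch only: the option values are assumptions, not a tested recipe.
SAPT_OPTIONS = {
    "basis": "aug-cc-pvdz",            # orbital basis
    "scf_type": "df",
    "df_basis_sapt": "def2-qzvpp-ri",  # assumed def2 fitting set, distinct from the orbital basis
}


def configure_psi4(memory="100 GiB"):
    """Apply the options above via the psi4 Python module."""
    # Imported here so the dictionary can be inspected without psi4 installed.
    import psi4

    psi4.set_memory(memory)
    psi4.set_options(SAPT_OPTIONS)
```

After calling `configure_psi4()`, the molecule definition and the `energy('sapt2', ...)` call would follow as in the original input. Leaving `df_basis_sapt` unset altogether lets the defaulting mechanism pick the fitting basis instead.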

@ccavender

Yes, please do make a pull request. Integer overflow fixes very welcome!


#9

@ccavender

That is great to hear.
Is there any way I can access your patch and try it on my system?

The thread you pointed to in your post discusses a patch for an integer overflow that has to be obtained from GitHub. Unfortunately, I use a conda-compiled binary of psi4.
Brief instructions on how to install psi4 with the patch from GitHub would be really helpful.

@loriab

I will try a def2 fitting basis.


#10

Thanks for all the help.

I compiled psi4 from source (using git) and ran the same job with the aug-cc-pvdz basis set.
It ran to completion in 5 hours 45 minutes. Maybe the changes mentioned in the thread @ccavender referred to are already incorporated in the master branch of psi4.


#11

Those changes are not in the master branch of Psi.

Glad to hear it worked, but I’m still worried that it failed for me on a fairly up-to-date version of master. I can update it and try again.


#12

@anakin It looks like the integer overflow fix was merged into the master branch yesterday (Nov 26), so if you run into any further problems with large systems you can update your psi4 source and recompile to see if that helps.

