Unusual termination of a SAPT calculation: need urgent help, please

Dear Users,
I have recently installed the nightly-build binary (Linux only) version of the Psi4 code.
My system is reasonably big (36x2 = 72 atoms, or 24x2 = 48 atoms). I have decomposed the supermolecule into two parts, treating it as a complex of two monomers; each monomer has a charge of 0 and a multiplicity of 1 (please see below). When I ran a SAPT0 calculation on this system, the run terminated at the point shown below, even after increasing the memory substantially. I would be grateful for your advice on resolving this problem.

The following are the last lines of the output file.

...


*** tstop() called on node001 at Thu Dec 10 12:12:41 2015
Module time:
user time = 1804.65 seconds = 30.08 minutes
system time = 32.97 seconds = 0.55 minutes
total time = 1838 seconds = 30.63 minutes
Total time:
user time = 9769.61 seconds = 162.83 minutes
system time = 131.23 seconds = 2.19 minutes
total time = 10053 seconds = 167.55 minutes

//>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>//
// SAPT0 //
//<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<//

*** tstart() called on node001
*** at Thu Dec 10 12:12:41 2015

    SAPT0
Ed Hohenstein
 6 June 2009

  Orbital Information

NSO        =      1104
NMO        =      1104
NRI        =      3456
NRI (Elst) =      4128
NOCC A     =        87
NOCC B     =        87
FOCC A     =        24
FOCC B     =        24
NVIR A     =      1017
NVIR B     =      1017

Elst10,r            =    -0.000048414448 H
Exch10              =     0.001101146326 H
Exch10 (S^2)        =     0.001100550800 H

================================
The input that produced the above output is, roughly, the following (I have not pasted the full coordinates here). Please note that I have tested memory settings up to 65 GB, although in the example below I show 5 GB.

#! xxxxxxxxxxxxxxxx

memory 5000 mb

molecule cluster {
0 1



--
0 1



units angstrom
}

set globals {
basis aug-cc-pvdz
df_basis_scf aug-cc-pvdz-jkfit
df_basis_sapt aug-cc-pvdz-ri
df_basis_elst aug-cc-pvdz-jkfit
guess sad
scf_type df
puream true
print 1
basis_guess true
}

set sapt {
freeze_core true
aio_cphf true
aio_df_ints true
}

energy('sapt0')

Eelst = psi4.get_variable("SAPT ELST ENERGY")
Eexch = psi4.get_variable("SAPT EXCH ENERGY")
Eind = psi4.get_variable("SAPT IND ENERGY")
Edisp = psi4.get_variable("SAPT DISP ENERGY")
ET = psi4.get_variable("SAPT SAPT0 ENERGY")
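
For reference, the components retrieved above can also be written into the output file from the same input; a minimal sketch using psi4.print_out (the format strings here are just an example, not part of the original input):

# write the retrieved SAPT components into the Psi4 output file
psi4.print_out('Eelst = %18.12f H\n' % Eelst)
psi4.print_out('Eexch = %18.12f H\n' % Eexch)
psi4.print_out('Eind  = %18.12f H\n' % Eind)
psi4.print_out('Edisp = %18.12f H\n' % Edisp)
psi4.print_out('ET    = %18.12f H\n' % ET)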

On a substitute molecule, I find no particular problem with your input, and there isn't actually an error in the output you copied. The simplest things to check:

(a) Are you capturing stdout/stderr to see whether something was thrown?
(b) The point where your output ends is where the CPHF iterations start; they may just take a while.
(c) Have you definitely set PSI_SCRATCH, so that your big job isn't just filling up /tmp? (See the sketch below.)
(d) Have you run a calculation with this input on a smaller system and checked that the SAPT finishes properly and that scratch is going to the right place?
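
If the PSI_SCRATCH environment variable is awkward to propagate through a PBS script, the scratch location can also be set at the top of the input file itself; a minimal sketch, where the path is a hypothetical placeholder for your per-node scratch directory:

# place near the top of the input, before energy() is called;
# the path below is a placeholder, not a real directory
psi4_io.set_default_path('/SCRATCH/your_username/psi4')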

Dear Sir,
Thank you very much for your kind explanations. It is not a problem with the Psi4 scratch. In my PBS script, for instance, I point it to the /SCRATCH directory designated for this specific purpose. We normally use this scratch folder (on different nodes) for GAUSSIAN, GAMESS, VASP, SIESTA, etc., and those calculations do not run into disk-space problems of the kind you mention. Please advise further if possible, to help me fix this problem. Thank you very much once again for your kind tips.

Have you tried it without the asynchronous I/O keywords? Otherwise, I'll need to know your molecule to diagnose any further. You can send it to the psi4aiqc+feedback Gmail address if you'd rather not post the structure.
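
For reference, disabling the asynchronous I/O amounts to flipping the two aio keywords in the sapt block of the input above (false is also their default):

set sapt {
freeze_core true
aio_cphf false
aio_df_ints false
}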

Thank you, sir. Can I have the Gmail address, please?

Add @gmail.com to the aforementioned address.

Dear Dr. Loriab,
I am very thankful for your previous responses. As you suggested, I sent you the structure of the molecule by email several days ago, but I have not heard anything back about it.

Regards
Pradeep

Dear Pradeep,

This would be a lot easier for us to figure out if you would paste the actual error message into one of your posts. Otherwise, it is impossible for us to guess what might have gone wrong.

Best,
David

Dear Sir,
I had a few communications with Loriab, as you can see above. He advised me to send him the input file by email if I could not paste it here for some reason, which I did, attaching the structure of the said system. It has now been over 10 days, and I have not heard anything from him.

In any case, I have had no error message for my input. The running job simply terminates at a certain point, without any error. I have already copied the final output into my first post in this thread.

Thank you very much.
Bests,
Pradeep

I used 250 GB of RAM for the system with 76 atoms. The error we encountered is as follows:

/bin/rm: No match.
Traceback (most recent call last):
File "", line 109, in
File "/usr/local/psi4/4.0b5_intel/share/psi/python/driver.py", line 549, in energy
procedures['energy'][lowername](lowername, **kwargs)
File "/usr/local/psi4/4.0b5_intel/share/psi/python/proc.py", line 1782, in run_sapt
e_sapt = psi4.sapt()
RuntimeError: Not enough memory
file: /raid/usrlocal/src/PSICODE/psi4.0b5/src/lib/libsapt_solver/sapt2.cc
line: 211

=====================================================================
Resource Usage
Fri Jan 29 20:24:20 WST 2016
Job Id: 110248. …
Job Name: 2-on-top-BEWZEJ_20core_256GB.job
Queue: sleep
Exit Status: 0
                 Requested        Used
NCPUs:                  20
CPU Time:                     09:37:10
Memory:                0mb     75233mb
VMem:                          76725mb
Walltime:       1584:00:00     00:38:18
Temp Dir:              0mb         0mb

Sorry, I have not received your structure. psi4aiqc+feedback@gmail.com

Thank you, Loriab. I did manage to solve the problem by myself; I had to use 250 GB of RAM and 20 cores for this. This gives me the feeling that the code demands too much memory for the calculation.

I sent an email to: loriab@gmail.com

Something doesn’t seem right… 76 atoms should not require 256 GB of RAM for a SAPT0 computation…

David
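
For context, a rough back-of-the-envelope estimate using the orbital dimensions from the first post supports this; a minimal sketch in plain Python, assuming double-precision (8-byte) storage for one DF amplitude block of (active occupied x virtual x auxiliary) size:

# dimensions taken from the posted SAPT0 output header
nri, nocc, focc, nvir = 3456, 87, 24, 1017
act = nocc - focc            # 63 active occupied orbitals per monomer
bytes_per_double = 8
gb = nri * act * nvir * bytes_per_double / 1024.0**3
print('one DF amplitude block: %.1f GB' % gb)   # ~1.6 GB

Even allowing for several such intermediates held at once, this is consistent with the comment above that 256 GB seems far more than a SAPT0 computation of this size should need.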

The log file I got from this run is identical to those from several previous low-memory jobs on other systems. How, then, can I confirm what is not right?

I have now forwarded my email containing the structure again to: psi4aiqc+feedback@gmail.com