RuntimeError in a SAPT calculation on a cluster

Dear colleagues,

I wrote a simple Python script (psi4teste.py) that submits two jobs (psi41.pbs and psi42.pbs) to a PBS queue on a cluster. The jobs calculate the SAPT energy at the sapt2+(3)dmp2/aug-cc-pVTZ level for two input files (metano1_1.dat and metano1_2.dat).

However, the energy calculations crash and the output files (metano1_1.out and metano1_2.out) are incomplete, giving the error below:

Traceback (most recent call last):
  File "/home/sw/anaconda3/envs/p4env/bin/psi4", line 287, in <module>
    exec(content)
  File "<string>", line 33, in <module>
  File "/home/sw/anaconda3/envs/p4env/lib//python3.7/site-packages/psi4/driver/driver.py", line 556, in energy
    wfn = procedures['energy'][lowername](lowername, molecule=molecule, **kwargs)
  File "/home/sw/anaconda3/envs/p4env/lib//python3.7/site-packages/psi4/driver/procrouting/proc.py", line 3309, in run_sapt
    dimer_wfn = scf_helper('RHF', molecule=sapt_dimer, **kwargs)
  File "/home/sw/anaconda3/envs/p4env/lib//python3.7/site-packages/psi4/driver/procrouting/proc.py", line 1363, in scf_helper
    e_scf = scf_wfn.compute_energy()
  File "/home/sw/anaconda3/envs/p4env/lib//python3.7/site-packages/psi4/driver/procrouting/scf_proc/scf_iterator.py", line 84, in scf_compute_energy
    self.initialize()
  File "/home/sw/anaconda3/envs/p4env/lib//python3.7/site-packages/psi4/driver/procrouting/scf_proc/scf_iterator.py", line 198, in scf_initialize
    self.initialize_jk(self.memory_jk_, jk=jk)
  File "/home/sw/anaconda3/envs/p4env/lib//python3.7/site-packages/psi4/driver/procrouting/scf_proc/scf_iterator.py", line 125, in initialize_jk
    jk.initialize()

RuntimeError:
Fatal Error: PSIO Error
Error occurred in file: /scratch/psilocaluser/conda-builds/psi4-multiout_1557940846948/work/psi4/src/psi4/libpsio/error.cc on line: 128
The most recent 5 function calls were:

psi::PSIO::rw(unsigned long, char*, psi::psio_address, unsigned long, int)
psi::PSIO::write_entry(unsigned long, char const*, char*, unsigned long)
psi::DiskDFJK::initialize_JK_core()

Any suggestions as to what this error might be?

All the files I mentioned are available at this link:
https://drive.google.com/drive/folders/184LIMsbsYQkGD6mpyMH9qgIix74GsQIf?usp=sharing

Sorry for my lack of experience.

Thanks in advance for your time!

Is there a reason you can’t just upload the output files to your post? I apparently need permissions to look at your output files.

Without seeing your input or output files, I can’t say for certain, but I would bet that the computation requires more disk space and/or memory than Psi has available.
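The disk-space half of that guess is easy to check by looking at how much room is left on the filesystem where the temporary files are written. A minimal sketch, assuming the directory is supplied by the caller; the helper name `free_gib` and the 10 GiB threshold are illustrative and are not Psi4 settings:

```python
import shutil
import tempfile


def free_gib(path):
    """Return the free space, in GiB, on the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free / 1024**3


# Assumption: the system temporary directory stands in for the scratch
# directory here; substitute the path your jobs actually write to.
scratch = tempfile.gettempdir()
needed_gib = 10  # hypothetical requirement, not a Psi4 default
available = free_gib(scratch)
print(f"{scratch}: {available:.1f} GiB free")
if available < needed_gib:
    print("Warning: scratch space may be too small for this job")
```

Running this on the compute node itself, rather than the login node, gives the number that matters for the job.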

@jmisiewicz

Thanks for the reply.

I have now given access to anyone with the link. Could you kindly try again?
https://drive.google.com/drive/folders/184LIMsbsYQkGD6mpyMH9qgIix74GsQIf?usp=sharing

PS 1: I tried to attach the files, but the forum would not accept the .dat (input) and .out (output) extensions, so I decided to make them available via the link.

PS 2: The scripts were running normally before, so I must have done something wrong that is causing this problem.

Excuse me for the mistake.

Thanks in advance for your time.

Oh, this is an easy one. You need to specify the amount of memory you’re giving Psi4. See here.

@jmisiewicz

Thanks for the reply.

I specified the amount of memory as indicated and the error still persists. The new input file is shown below (I also uploaded the new versions of the files to the link):

[image: updated input file]

The output I got was as follows:


I find this very strange because earlier today this script was running normally, without errors.

I don’t know what I could have done wrong for the script to stop working.

Thanks in advance.

That’s more troubling. Now that I try to reproduce this error myself on the current Psi4, the computation finished just fine for me even without changing the memory. I’m not aware of any bugs we’ve fixed that might cause this behavior, which makes me think it’s a matter of cluster configuration.

Are you absolutely sure about which scratch directory you’re using for this computation? The conda nightly version of Psi4 will print out your scratch directory for every energy computation you do. If you’d rather not try the nightly build, or want a quick answer, putting print(core.IOManager.shared_object().get_default_path()) in your input file should print out the scratch directory.

@jmisiewicz

Thanks for the reply.

I have now tried to submit my python script (same script, i.e., psi4teste.py) to run the 2 jobs in a PBS queue and I am getting the following error message:

[image: error message]

Perhaps there is a problem with the cluster. Tomorrow I will contact the administrator of the cluster to investigate this problem that I am having.

I ran some tests today with the same example script. Some ran normally, but for others I canceled the Psi4 run before it finished. This may be the cause of the problem.

I’ll be back with news soon.

Again, thanks for your time and patience.

@jmisiewicz

Apparently the problem is that the temporary files created by Psi4 are being written to /tmp on the cluster and are filling that directory.

Is there a way to write the temporary files to the folder in which I am running the Psi4 scripts, so that I can clean them up if necessary?

If so, how could I do that?

Thank you for now.

I’m not seeing the scratch directory being set anywhere in your PBS files or your psi4teste.py script. Hopefully you’re running on a cluster where the nodes have local scratch, as network scratch will be really slow and /tmp/, as you’ve found, is too small. If your cluster admin provides a scratch location in a variable, set it in the PBS file with either the PSI_SCRATCH environment variable or the psi4 ... -s command-line option. See “Installation and Runtime Configuration” in the Psi4 manual. Some nodes on clusters these days don’t have local scratch, and that can be a problem for disk-dependent methods like higher-order SAPT.
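One way to apply this from a Python submission script is to give each job its own scratch directory and pass it to Psi4 through PSI_SCRATCH. The sketch below is only an illustration: the helper names `build_psi4_job` and `run_psi4_job`, the per-job directory layout, and the example paths in the comments are assumptions, not part of the original scripts.

```python
import os
import shutil
import subprocess
import tempfile


def build_psi4_job(input_file, output_file, scratch_root):
    """Create a private scratch directory and return (command, env, scratch)."""
    # A unique subdirectory per job keeps concurrent jobs from mixing files
    # and makes cleanup after a cancelled job a single directory removal.
    scratch = tempfile.mkdtemp(prefix="psi4_", dir=scratch_root)
    env = dict(os.environ)
    env["PSI_SCRATCH"] = scratch  # scratch location, as suggested above
    command = ["psi4", "-i", input_file, "-o", output_file]
    return command, env, scratch


def run_psi4_job(input_file, output_file, scratch_root):
    """Run one Psi4 job and remove its scratch directory afterwards."""
    command, env, scratch = build_psi4_job(input_file, output_file, scratch_root)
    try:
        return subprocess.run(command, env=env, check=False).returncode
    finally:
        # Runs even if the job fails, so leftovers do not fill the disk.
        shutil.rmtree(scratch, ignore_errors=True)


# Example call (hypothetical scratch root; use what your admin provides):
# run_psi4_job("metano1_1.dat", "metano1_1.out", "/path/to/local/scratch")
```

Keeping the command construction separate from the execution makes it possible to inspect the environment a job would receive without actually launching Psi4.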

@loriab @jmisiewicz

Thanks for the comments.

I managed to solve the problem by adding the line below to my PBS files, which sends the temporary files to the folder where my Psi4 scripts are located:

export PSI_SCRATCH=/home/users/e119340/psi4
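Since cancelled runs can leave temporary files behind, it may be worth checking this folder for stale files from time to time. A minimal sketch, assuming the temporary file names begin with `psi.` (an assumption about the naming scheme that should be verified on your system); the helper name `stale_scratch_files` is hypothetical, and the function only lists candidates without deleting anything:

```python
import os
import time


def stale_scratch_files(scratch_dir, max_age_hours=24.0, prefix="psi."):
    """List files in scratch_dir that start with `prefix` and are older
    than max_age_hours, judged by last modification time."""
    cutoff = time.time() - max_age_hours * 3600
    stale = []
    for entry in os.scandir(scratch_dir):
        # Skip subdirectories and symlinks; only plain files are considered.
        if not entry.is_file(follow_symlinks=False):
            continue
        if entry.name.startswith(prefix) and entry.stat().st_mtime < cutoff:
            stale.append(entry.path)
    return sorted(stale)


# Example call, using the folder from the export line above:
# for path in stale_scratch_files("/home/users/e119340/psi4"):
#     print(path)
```

Reviewing the list before removing anything avoids deleting files that belong to a job that is still running.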

Thanks for your time.

Best regards,
