Error in PSIO_WT_TOCLEN()

I’ve got a new build of Psi4 running on a CentOS Linux cluster that keeps failing for somewhat large jobs in which I correlate nearly all the electrons. The only error shows up in the standard error file:

Error in PSIO_WT_TOCLEN()!

Nothing posts to the output file. The file system has just over 5 TB of space, and the largest Psi4 file (file 103) is only 143 GB. The case is only a triatomic molecule (ClO2) in Cs symmetry with 351 active MOs. I’m attempting a BCCD calculation, but it always dies after the integral transformation and before the first CCSD iteration. Here are the last few lines:

    Size of irrep 0 of tIjAb amplitudes:       9.311 (MW) /     74.484 (MB)
    Size of irrep 1 of tIjAb amplitudes:       4.442 (MW) /     35.533 (MB)
    Total:                                    13.752 (MW) /    110.018 (MB)

Any ideas? This is reproducible for different molecules and on different compute nodes. Input is pretty simple:

# ClO2 test BCCD

memory 5 gb

molecule {
    0 2
    Cl
    O 1 R1
    O 2 R2 1 A
    R1 = 2.03230554
    R2 = 1.20810395
    A  = 115.36876732
}

set {
    reference rohf
    basis aug-cc-pwcvqz
    print_mos true
    print 2
    scf_type pk
    guess sad
    freeze_core -2
}

energy('bccd')

I can confirm the error with my Psi4 (latest conda build) but can offer no explanation.
It works with smaller basis sets.

The error means there is a problem writing the metadata for a data entry, but I’m not familiar with this part of the code.

Thanks, Holger, for taking a look. Yes, the same input works with awcvtz. The same problem also occurs for a frozen-core calculation with av5z, so it’s definitely a size thing.
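
For the record, the frozen-core av5z run differed from the input above only in these options (a sketch; everything else was unchanged, and the comments are mine):

    set {
        basis aug-cc-pv5z    # "av5z"
        freeze_core true     # freeze the core instead of freeze_core -2
    }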

I’m fighting too many other issues to deal with this one, but it’s worth adding to the issue tracker.

One technical correction: this specific error message is raised before Psi writes any data to the file. It happens when Psi tries to seek to the start of the file (the SYSTEM_LSEEK call). If you can reproduce this reliably, the first thing I’d do is add a “file exists” check to that function.

Thanks, Jonathon. I’ve reproduced it with CCSD as well and am checking a closed-shell calculation now. Up until it dies it’s certainly writing a lot of files (100 GB or more), so I guess the first step is to figure out which new file it’s getting ready to write to when it dies.
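
If it helps with the detective work, scratch handling can be steered from the input itself via the psi4_io handler (a sketch, assuming the standard handler Psi4 exposes to input files; the path is a placeholder):

    # Keep unit 103 (the 143 GB file here) after cleanup so the scratch
    # directory's timestamps show which unit was touched last before the crash.
    psi4_io.set_default_path('/scratch/myuser')   # placeholder scratch path
    psi4_io.set_specific_retention(103, True)     # retain unit 103 on disk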

I probably don’t have a fix, but it would be helpful to know if you see the same behavior after adding the option

cachelevel 0
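
In the input above that means one extra line in the existing set block (a sketch; cachelevel is the real option name, the comment is mine):

    set {
        reference rohf
        basis aug-cc-pwcvqz
        scf_type pk
        guess sad
        freeze_core -2
        cachelevel 0    # don't cache integrals/amplitudes in memory
    }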

Thanks Rollin, I’ll check it out.

I’m still testing a bit more, but setting cachelevel to 0 seems to have fixed it. It might be worth a note in the docs that this option doesn’t just affect memory use but can also surface as I/O errors (for whatever reason).

Yep, problem solved. Thanks, all!
