Memory usage using jupyter (writing cubes)

jcerezo · May 13, 2021, 6:49pm

Dear all,

I am using psi4 and jupyter to implement some computational chemistry labs that we were previously running with a commercial code. Almost all the steps we used to follow can be adapted properly. However, the memory usage seems to be too large, and the memory is never released. Namely, after running an SCF calculation, the orbitals are written to cube files, and each file generated takes a large amount of memory.

It seems this is a known issue related to the fact that CPython is not able to release memory while running (rather than a problem on the Psi4 side), and I think I can solve it by running the writing step in a child process using the multiprocessing module, as suggested here. Anyway, I was wondering if anyone has a better alternative to reduce memory usage.
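A minimal sketch of the child-process idea, assuming a POSIX system where the `fork` start method is available (as on a typical Linux JupyterHub server). `run_in_child` and `write_dummy_cube` are hypothetical names introduced for illustration; the dummy writer only stands in for the real cube-writing call so that the example runs without Psi4.

```python
import multiprocessing as mp
import os
import tempfile


def run_in_child(func, *args, **kwargs):
    """Run func(*args, **kwargs) in a forked child process and wait for it.

    Memory allocated by the call belongs to the child, so it is handed
    back to the operating system when the child exits.
    """
    ctx = mp.get_context("fork")  # assumption: POSIX only, not Windows
    proc = ctx.Process(target=func, args=args, kwargs=kwargs)
    proc.start()
    proc.join()
    if proc.exitcode != 0:
        raise RuntimeError(f"child process exited with code {proc.exitcode}")


def write_dummy_cube(path, n_values):
    """Hypothetical stand-in for the cube-writing step."""
    data = [0.0] * n_values  # temporary allocation that lives only in the child
    with open(path, "w") as handle:
        handle.write(f"{len(data)} grid values\n")


if __name__ == "__main__":
    out_path = os.path.join(tempfile.mkdtemp(), "dummy.cube")
    run_in_child(write_dummy_cube, out_path, 1_000_000)
    print(os.path.exists(out_path))
```

In the notebook, the stand-in would be replaced by the actual writing step, for example `run_in_child(psi4.cubeprop, wfn)`. This assumes the forked child can use the `wfn` object it inherits from the parent, which has not been tested here; forking a process that already runs threads can be fragile.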

Thanks!

Regards,

Javier

hokru · May 14, 2021, 7:47am

We know about memory leaks after repeated SCF calculations in Python shells or notebooks.
They are usually noticeable only when running hundreds of calculations.
On the C side we did a lot of cleaning and checking, but there are still issues somewhere that we have not been able to identify yet.

You say you noticed an issue specifically with fchk files as well? Which psi4 version are you using?

jcerezo · May 14, 2021, 12:01pm

Thanks for the quick reply!

I am using version 1.3.2. I noticed the issue when using psi4.driver.p4util.cubeprop() as, apparently, memory is being consumed even if files are written to disk. It seems to be a problem on the Python side as indicated in the previous post.

Admittedly, the memory usage is not huge (for a diatomic such as Cl2, it takes ~30MB to generate the valence MOs), and it can be limited somewhat by tuning the options wisely. However, I was initially expecting no memory usage, since the files are written to disk. Moreover, since I am using fchk generation along with a scan over a bond length, the memory usage can rise depending on the number of steps (normally at least 10). Also important, we are planning to deploy on a server for the students using JupyterHub, so the memory consumption scales with the number of students (~50 at a time). So, a way to avoid unnecessary memory usage would help. Just as an example, the whole notebook (including optimization, the scan along the bond length, and visualization with NGLview) takes ~700MB (Cl2) or ~450MB (Li2), where nearly half of the memory usage happens at cubeprop().
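For reference, per-step figures like these can be collected with the standard library alone. The following is a rough sketch, not necessarily how the numbers above were obtained; `peak_rss_mb` and `report_peak_growth` are hypothetical helper names, and the `resource` module is only available on POSIX systems.

```python
import resource
import sys
from contextlib import contextmanager


def peak_rss_mb():
    """Peak resident set size of this process, in MB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


@contextmanager
def report_peak_growth(label, results):
    """Record in results[label] how much the peak RSS grew inside the block."""
    before = peak_rss_mb()
    try:
        yield
    finally:
        results[label] = peak_rss_mb() - before


results = {}
with report_peak_growth("allocation", results):
    # Stand-in for a notebook step such as the cube generation
    buffer = bytearray(b"\x01") * (20 * 1024 * 1024)
print(results)
```

Because this tracks the peak rather than the current usage, it shows how much a step raised the high-water mark, not whether the memory was released afterwards.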

By the way, things seem to get worse regarding memory consumption when using 1.4, as mentioned in a related post: Memory taken importing psi4 module (v1.4)
For instance, the example with Li2 reaches up to 1GB (where most of the memory usage comes from loading the psi4 module).

Thanks again!

Javier

hokru · May 14, 2021, 1:59pm

I am a bit confused: fchk() and cubeprop() are different functions.
So cubeprop() also shows a memory consumption issue?

jcerezo · May 14, 2021, 2:21pm

Oops, you’re totally right. I meant cube generation from the beginning (not fchk). Sorry for the misunderstanding (I’ve updated the topic title accordingly).

hokru · May 14, 2021, 2:36pm

Thanks for the clarification. I opened an issue on github: memory leak in cubeprop() · Issue #2181 · psi4/psi4 · GitHub