Psi4 exit-function error messages at the end of a multi-step parallel calculation

Hi,

I’m running a multi-step, high-throughput calculation on 100+ molecules, written in Python using OpenMPI and mpi4py, with Psi4 as the first stage to optimise each molecule. It runs on my university’s HPC cluster through the Slurm scheduler. The calculation uses a master/worker set-up so that multiple optimisations can run at the same time.
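
Roughly, the pattern is the following (a simplified sketch with placeholder names such as run_all_steps, not my actual code):

    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    n_molecules = 100                     # illustrative

    def run_all_steps(molecule_index):
        # Placeholder for the real per-molecule pipeline (Psi4 optimisation + later steps).
        return molecule_index

    if rank == 0:
        # Master: seed each worker with one molecule, then keep feeding work as results come back.
        tasks = list(range(n_molecules))
        results = []
        status = MPI.Status()
        for worker in range(1, size):
            comm.send(tasks.pop(), dest=worker, tag=1)
        while len(results) < n_molecules:
            result = comm.recv(source=MPI.ANY_SOURCE, tag=MPI.ANY_TAG, status=status)
            results.append(result)
            if tasks:
                comm.send(tasks.pop(), dest=status.Get_source(), tag=1)
            else:
                comm.send(None, dest=status.Get_source(), tag=0)   # shutdown signal
    else:
        # Worker: run the pipeline for each molecule received, stop on the shutdown signal.
        while True:
            task = comm.recv(source=0, tag=MPI.ANY_TAG)
            if task is None:
                break
            comm.send(run_all_steps(task), dest=0, tag=2)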

In the job submission directory, a folder is created for each molecule, and within each of those a further sub-folder is created for each step of the calculation, e.g.

Submission Directory
├── Molecule 1
│   ├── Psi4 (Calc 1)
│   ├── Calc 2
│   └── Calc 3
├── Molecule 2
└── etc.

(In this process only Calc 1 uses Psi4; the rest use other software.)
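
The folders themselves are created with standard Python before the work starts, roughly like this (names are illustrative):

    import os

    molecule_names = ["Molecule_1", "Molecule_2"]      # one entry per input molecule
    calc_steps = ["Psi4_Calc_1", "Calc_2", "Calc_3"]   # one sub-folder per calculation step

    for molecule_name in molecule_names:
        for step in calc_steps:
            os.makedirs(os.path.join(molecule_name, step), exist_ok=True)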

Computation-wise everything works fine and the calculations finish successfully. However, once all the steps for all molecules have completed, Psi4 still appears to be active in the background: it only runs its exit functions at that point, rather than when each individual calculation finishes, and it then prints errors such as those in the attached file.

psi4shutdown_errors.txt (7.8 KB)

Is this because the timer.dat file is being created in the submission directory rather than in the individual molecules’ folders, or perhaps because my code is written entirely in Python using the Psi4 Python module?

Additionally, is there a way to either suppress these messages or, ideally, fully shut Psi4 down after each call completes? My main concern is that Psi4 is still taking up memory in the background.
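
To clarify what I mean by “fully shut down”: each worker currently just calls psi4.core.clean() after the optimisation, whereas what I’d ideally like is something closer to the sketch below, where everything Psi4 writes is confined to the molecule’s own folder and released afterwards (illustrative only; I’m not sure psi4.core.clean() is sufficient on its own):

    import os
    import psi4

    def run_psi4_step(molecule_dir, calculate):
        # Run a single Psi4 step from inside the molecule's own folder so that
        # files such as timer.dat land there, then clean up and return.
        start_dir = os.getcwd()
        try:
            os.chdir(molecule_dir)
            result = calculate()        # wraps the psi4.optimize(...) call
        finally:
            psi4.core.clean()           # remove Psi4 scratch files for this step
            os.chdir(start_dir)
        return result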

Below is the Psi4 optimisation section of my code.

# Module-level imports used by this method (it lives inside a larger class)
import os

import psi4


def psi4_calculate(self, coordinates, calculation_type, output_file_path):
    psi4.core.set_num_threads(8)
    psi4.set_memory("4000MB")

    psi4.set_options({"basis": self.basis_set,
                      "geom_maxiter": 50,
                      "MAXITER": 100,
                      "G_CONVERGENCE": "GAU",
                      "D_CONVERGENCE": 6,
                      "E_CONVERGENCE": 6})

    # Build the geometry string from the coordinate list
    # (element handling is done by self.converter)
    xyz = ""
    for coordinate in coordinates:
        xyz += f"{self.converter(coordinate[0])} {coordinate[1]} {coordinate[2]} {coordinate[3]}\n"

    mol = psi4.geometry(xyz)

    if calculation_type == "opt":
        if os.path.exists(output_file_path) and os.path.exists(f"{output_file_path[:-4]}.xyz"):
            # The optimisation has already been run; read the existing results instead
            return self.psi4_results_reader(calculation_type, output_file_path)
        else:
            psi4.core.set_output_file(output_file_path)
            energy = psi4.optimize(self.method)

            # Save the optimised geometry next to the Psi4 output file
            xyz_to_save = str(mol.natom()) + "\n"
            xyz_to_save += mol.save_string_xyz()

            with open(f"{output_file_path[:-4]}.xyz", "w") as saved_xyz:
                saved_xyz.write(xyz_to_save)
            coordinates = coordinates_from_xyz_file(f"{output_file_path[:-4]}.xyz")

            psi4.core.clean()
    return energy, coordinates
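
For reference, each worker calls this roughly as follows (simplified; the coordinate entries and attributes such as self.method and self.basis_set are set elsewhere in my class, and the path is illustrative):

    energy, optimised_coordinates = self.psi4_calculate(
        coordinates,                                    # list of [element, x, y, z] entries
        "opt",
        os.path.join("Molecule_1", "Psi4_Calc_1", "opt.out"),
    )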

Thank you
Jay

I see this topic, but my personal queue is taken up by other issues.

I’ll warn you that a lot of information is missing, because you’re asking questions about a very complicated piece of code. A minimal working example would be much appreciated.

Nothing about your shutdown errors suggests to me that Psi4 is taking up memory in the background. That should all be cleared upon a FatalError.

Typically on an HPC, at least in my experience with LSF, optimization jobs fail if there is no PSI_SCRATCH variable set to a storage destination with plenty of space. For example, optimizing a molecule of, say, 60 atoms can create up to 10 GB of temporary files, with binary numpy files for grids and other matrix storage. But it sounds like you may have already done this, since rereading your post you say the optimizations complete?
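
For example, in a PsiAPI-style driver it can be set before psi4 is imported (the path here is just an example; an export in your .bashrc or the Slurm submission script works too):

    import os

    # Point Psi4's scratch files at a location with plenty of space
    # (node-local or cluster scratch) instead of the default (typically /tmp).
    os.environ["PSI_SCRATCH"] = "/scratch/your_username/psi4_scratch"

    import psi4   # imported after PSI_SCRATCH is set so it picks up the path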

Make sure you have set PSI_SCRATCH in your .bashrc, as indicated for a similar wrapper, poltype2, which only uses Psi4 to optimize ligands. Is your code published on GitHub?

Typical things to check are what @loriab discusses here: Strange crash on Psi4. In particular, pay attention to her comment: “Also, Psi4 won’t benefit from running under MPI. Its strength is intranode parallelism.”

Hi,

Thank you for your replies and suggestions. I’ve been trying to put together a minimal working example as suggested, but so far haven’t been able to make a representative version. However, while attempting to do so I believe I’ve found the cause: it is essentially down to how I’m shutting down the workers that have used Psi4 (the master process sends shutdown commands to them), so it is more to do with my code and implementation than with Psi4. I believe this topic can therefore be closed, as it doesn’t directly relate to an issue with Psi4.
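
In case it’s useful to anyone else: in terms of the worker loop sketched in my first post, each worker process only exits, and Psi4 therefore only runs its exit functions, when the master sends the shutdown message at the very end of the whole run, rather than after the worker’s last Psi4 calculation, e.g.

    # Worker loop (simplified, as in the first post): the process only ends,
    # and Psi4 only runs its exit functions, once the shutdown message arrives
    # at the end of the entire run.
    while True:
        task = comm.recv(source=0, tag=MPI.ANY_TAG)
        if task is None:              # shutdown command from the master
            break                     # worker exits here, so Psi4 shuts down here
        comm.send(run_all_steps(task), dest=0, tag=2)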

Thank you for your suggestions and the reassurance about the memory.
Jay