Hi, I am running SAPT calculations in Psi4 via Python on a cluster managed by the SLURM Workload Manager. Until now I have been running my calculations on a single node, and I was wondering if it is possible to distribute them across multiple nodes via OpenMPI or something similar. Can someone tell me if this is possible?
I reserved two nodes with SLURM and set the number of threads to 56 (2 nodes x 28 threads) in my Psi4 Python script via psi4.set_num_threads(), but it looks like it’s still only running on one node.
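For context, the relevant part of my Python script looks roughly like this (the geometry and memory settings here are just placeholders, not my actual system):

```python
import psi4

psi4.set_memory("8 GB")
psi4.set_num_threads(56)  # 2 nodes x 28 threads -- this is what I tried

# Placeholder dimer; my real input is a larger SAPT system.
dimer = psi4.geometry("""
0 1
He 0.0 0.0 0.0
--
0 1
He 0.0 0.0 3.0
""")

psi4.energy("sapt0")
```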
Can someone help me figure out how to set this up correctly? Any tips or examples would be super appreciated!
Thanks a bunch!
I’m not sure I fully understand your question, but it seems to be more about SLURM than about Psi4. I run Psi4 on both the SLURM and PBS queuing systems.
If you have 2 nodes with 24 cores each and want to run 4 calculations simultaneously with 12 cores each, below is an example of a SLURM job (if you drive Psi4 from Python instead of input files, see the sketch after the script):
#!/bin/bash
#SBATCH --nodes 2
#SBATCH --ntasks 4
#SBATCH --cpus-per-task 12
#### other information ####
#### load libraries ####
export OMP_NUM_THREADS=1
#### job execution ####
# Launch each calculation as its own job step so SLURM spreads the
# steps over both nodes; plain backgrounded processes would all start
# on the first node of the allocation.
srun -N1 -n1 -c12 --exclusive psi4 -i calc_1.dat -o calc_1.out -n 12 &
srun -N1 -n1 -c12 --exclusive psi4 -i calc_2.dat -o calc_2.out -n 12 &
srun -N1 -n1 -c12 --exclusive psi4 -i calc_3.dat -o calc_3.out -n 12 &
srun -N1 -n1 -c12 --exclusive psi4 -i calc_4.dat -o calc_4.out -n 12 &
wait
srun -N1 -n1 -c12 --exclusive psi4 -i calc_5.dat -o calc_5.out -n 12 &
srun -N1 -n1 -c12 --exclusive psi4 -i calc_6.dat -o calc_6.out -n 12 &
srun -N1 -n1 -c12 --exclusive psi4 -i calc_7.dat -o calc_7.out -n 12 &
srun -N1 -n1 -c12 --exclusive psi4 -i calc_8.dat -o calc_8.out -n 12 &
wait
...
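If you run Psi4 through Python rather than through `psi4 -i` input files, the same packing pattern applies: each script pins its own thread count and output file, and you launch one script per job step. A minimal sketch of what one of the packed scripts could look like (the file names, memory, and method are only illustrative):

```python
import psi4

# One of the packed calculations; launch it from the batch script the
# same way as the .dat inputs above, e.g. `srun ... python calc_1.py &`.
psi4.set_num_threads(12)            # matches -c12 / -n 12 above
psi4.set_output_file("calc_1.out")
psi4.set_memory("4 GB")

mol = psi4.geometry("""
0 1
Ne 0.0 0.0 0.0
""")

psi4.energy("scf/cc-pvdz")
```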
Thanks a lot for your reply. What I actually want to do is distribute a single calculation across two nodes. I reserved two nodes using SLURM and tried running the calculation, but it seems to be using only one of them. I’m using Psi4 with Python, though in bash the process would look something like this. Is this possible?
#SBATCH --nodes 2
export OMP_NUM_THREADS=1
psi4 -i calc.dat -o calc.out -n 56