It seems that whenever I run psi4 it is running with nested parallelism…
When I run a calculation with OMP_NUM_THREADS=n, the job runs at 100*n^2% CPU. Putting OMP_NESTED=FALSE in the submission script does nothing, and OMP_THREAD_LIMIT=n drops it to 1 thread whenever it tries to use nested parallelism. I was wondering if there is an easy fix.
I compiled it myself. I also have the precompiled psi4conda, which doesn’t have this issue (but is still considerably slower). I can’t imagine there are any OMP environment variables I am unaware of…
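As a rough illustration of the 100*n^2% figure: it is what you would expect if two levels of parallel regions each spawned n threads. That two-level picture is an assumption about the cause, not something confirmed in this thread, and the function names below are made up for the sketch.

```python
def busy_threads(n_threads, nested_levels=2):
    """Threads kept busy if each of `nested_levels` parallel levels spawns n_threads.

    The two-level default is an assumption used only for illustration.
    """
    return n_threads ** nested_levels


def cpu_percent(n_threads, nested_levels=2):
    """CPU usage as a tool like top would report it (100% per busy thread)."""
    return 100 * busy_threads(n_threads, nested_levels)


if __name__ == "__main__":
    # Compare the expected usage without nesting against two nested levels.
    for n in (1, 2, 4, 8):
        print(f"n={n}: flat {cpu_percent(n, 1)}%, nested {cpu_percent(n)}%")
```

If the observed CPU usage follows the nested column rather than the flat one, that is consistent with the nested-parallelism reading above.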
OK… the problem is without the KMP_DUPLICATE_LIB_OK=TRUE setting I get the error message:
“OMP: Error #15: Initializing libiomp5.so, but found libiomp5.so already initialized.
OMP: Hint: This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library.”
It sounds like I need to recompile the code. My friend also said this issue may have arisen because I first downloaded the psi4conda binary before compiling psi4 from GitHub. With this in mind, do you have any suggestions for recompiling the code?
My intel v15 is too old for the developer’s C++11 enthusiasm
@bge6: To start with, it is probably best to build everything fresh from scratch, i.e., no psi4 libraries from conda (mkl is OK), an empty install PREFIX, etc. Study the initial cmake output to check whether everything is being built from source.
I’m also having this issue on a compile-from-source I’m doing with release 1.2. I’ve installed mkl using the academic license in the customary place for linux/ubuntu distributions (it is installed through the .deb intel distributes, if I recall correctly).
Recompiling doesn’t fix my issue on its own; I think that somewhere in the compilation process cmake is finding different libiomp5.so files, and is somehow linking both of them? I’m not sure how to hide the version not coming from my mkl install. Here’s my cmake script:
Peculiarly, some tests succeed. Based on the names of the tests, I suspect these are the tests that do not need compiled parallelism to execute, for example ones using psi4-numpy. Here’s an example output from a test that failed:
OMP: Error #15: Initializing libiomp5.so, but found libomp.so.5 already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.
Exit Status: infile ( -6 ); autotest ( None ); sowreap ( None ); overall ( 1 )
Test time = 1.00 sec
“cc1” end time: Dec 28 17:15 EST
“cc1” time elapsed: 00:00:00
I’d be happy to provide any additional information, but without knowing what might be useful it’s hard for me to know what else I should include. I have the entire compile process output logged, if that seems useful (it’s a lot of text so not copying it here).
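One piece of information that might help here is which OpenMP runtimes each compiled library actually pulls in. Below is a minimal sketch that parses `ldd` output and reports any of libiomp5, libomp, or libgomp it finds. The function names are hypothetical, and the path you pass to `ldd` is whatever shared object you want to inspect (nothing here assumes a particular Psi4 file name).

```python
import re
import subprocess

# Names of the common OpenMP runtimes (Intel, LLVM, GNU).
_OMP_NAME = re.compile(r"lib(iomp5|gomp|omp)\.so")
# An ldd line looks like: "\tlibfoo.so => /path/libfoo.so (0x00007f...)"
_LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(.+?)(?:\s+\(0x[0-9a-fA-F]+\))?\s*$")


def openmp_runtimes_in_ldd(ldd_text):
    """Return {library name: resolved path} for OpenMP runtimes in ldd output."""
    found = {}
    for line in ldd_text.splitlines():
        m = _LDD_LINE.match(line)
        if m and _OMP_NAME.match(m.group(1)):
            found[m.group(1)] = m.group(2)
    return found


def ldd_output(shared_object_path):
    """Run ldd on one shared object and return its text output (Linux only)."""
    result = subprocess.run(
        ["ldd", shared_object_path], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Running `openmp_runtimes_in_ldd(ldd_output(path))` on each compiled library in the build and comparing the results should show whether one of them resolves to libomp.so.5 while another resolves to libiomp5.so, which is the mix the error message complains about.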
Is your numpy from conda? It’s safest to have all the LAPACK and all the OpenMP requirements linking to the same library (Psi4, CheMPS2, libefp, NumPy all need LAPACK, and all those plus LAPACK itself need the OpenMP library). Fortunately, this is easy to do in conda.
If you’re using your own compilers or you really want to compile all the dependencies yourself, you will need a conda env, e.g., conda create -n p4dev numpy intel-openmp mkl-devel, conda activate p4dev. Then, on your cmake line, add
A warning about this is lurking here (search for “Because of how link loaders work”). There are so many seemingly innocent ways for the trouble to be introduced that controlling the build environment through the conda recommendation is the surest way to get the OpenMP setup right.
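To check whether everything really ends up on one OpenMP runtime, one option is to look at what is mapped into the running Python process after the imports. The sketch below parses /proc/self/maps, so it is Linux-only, and the helper name is made up for illustration. It imports only numpy; adding other imports (psi4, for instance) before the check would be the natural extension, assuming they import cleanly.

```python
import os
import re

# Names of the common OpenMP runtimes (Intel, LLVM, GNU).
_OMP_NAME = re.compile(r"lib(iomp5|gomp|omp)\.so")


def loaded_openmp_runtimes(maps_text):
    """Return the sorted, de-duplicated paths of OpenMP runtimes in a maps listing."""
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address range, perms, offset, device, inode, pathname.
        fields = line.split(None, 5)
        if len(fields) == 6:
            path = fields[5].strip()
            if _OMP_NAME.match(os.path.basename(path)):
                paths.add(path)
    return sorted(paths)


if __name__ == "__main__":
    try:
        import numpy  # noqa: F401  (imported only so its libraries get loaded)
    except ImportError:
        print("numpy is not installed in this environment")
    if os.path.exists("/proc/self/maps"):
        with open("/proc/self/maps") as handle:
            for path in loaded_openmp_runtimes(handle.read()):
                print(path)
```

More than one path in the output, especially one libiomp5.so next to a libomp.so.5 or two libiomp5.so files from different directories, would point to the duplicate-runtime situation described in the error message.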