Multi-threading two-electron integrals

I’m new to Psi4, and learning to perform geometry optimizations. In a CCSD geometry optimization, I noticed that the step following “Computing two-electron integrals…” is taking a long time and appears to be using only one CPU. Might this be easy to improve?

If I’m reading the code correctly, this message is printed in psi4/psi4/src/psi4/libmints/mintshelper.cc (master branch of psi4/psi4 on GitHub), and the code that follows iterates over an enumerated list of possible shell combinations, calling
eri->compute_shell(shellIter, writer)
where eri is a TwoBodySOInt. It appears to me (from psi4/psi4/src/psi4/libmints/sointegral_twobody.h, same branch) that compute_shell is intended to be called from within an OpenMP thread. Is there a reason why this code path doesn’t dispatch the compute_shell calls across the per-thread instances that it sets up for that purpose (also in mintshelper.cc)?
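
To make the question concrete, here is a rough sketch of the kind of dispatch I have in mind. It is not Psi4 code: ToyEngine, ShellQuartet, enumerate_quartets and parallel_checksum are hypothetical stand-ins I made up, and the "integral" is just an integer checksum so that the threaded and serial results can be compared exactly. The idea is to enumerate the shell quartets up front, then split the loop across threads with each thread using its own engine.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical shell quartet (p q | r s); not a Psi4 type.
struct ShellQuartet { int p, q, r, s; };

// Hypothetical per-thread engine standing in for a TwoBodySOInt instance.
// It accumulates an integer checksum instead of computing real integrals.
struct ToyEngine {
    long long checksum = 0;
    void compute_shell(const ShellQuartet& sq) {
        checksum += sq.p + 2LL * sq.q + 3LL * sq.r + 4LL * sq.s;
    }
};

// Enumerate the permutationally unique quartets: q <= p, s <= r, (rs) <= (pq).
std::vector<ShellQuartet> enumerate_quartets(int nshell) {
    std::vector<ShellQuartet> out;
    for (int p = 0; p < nshell; ++p)
        for (int q = 0; q <= p; ++q)
            for (int r = 0; r <= p; ++r) {
                const int smax = (r == p) ? q : r;
                for (int s = 0; s <= smax; ++s) out.push_back({p, q, r, s});
            }
    return out;
}

// Split the quartet list across threads; each thread only touches its own engine.
long long parallel_checksum(int nshell, int nthread) {
    assert(nthread >= 1);
    const std::vector<ShellQuartet> quartets = enumerate_quartets(nshell);
    std::vector<ToyEngine> engines(static_cast<std::size_t>(nthread));
    const long long n = static_cast<long long>(quartets.size());

#pragma omp parallel for schedule(dynamic) num_threads(nthread)
    for (long long i = 0; i < n; ++i) {
        int t = 0;  // stays 0 when compiled without OpenMP
#ifdef _OPENMP
        t = omp_get_thread_num();
#endif
        engines[static_cast<std::size_t>(t)].compute_shell(
            quartets[static_cast<std::size_t>(i)]);
    }

    // Serial reduction of the per-thread partial results.
    long long total = 0;
    for (const ToyEngine& e : engines) total += e.checksum;
    return total;
}
```

Calling parallel_checksum(3, 4) should give the same value as parallel_checksum(3, 1). The part this sketch glosses over is the writer: in the real code path the object receiving the integrals is presumably shared, so it would need to be thread-safe or replaced by per-thread buffers, and I don’t know whether that is the reason the loop is serial.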