Custom guess for Hartree-Fock

Dear Psi4 Community,

Is there a way to use initial guesses that I have calculated myself instead of the SAD guess?

The coefficient matrices I want to use for this purpose are currently NumPy arrays.

Kind regards

While there is a way to do this, it operates by saving a wavefunction object to disk. If you can express this initial guess you’ve calculated yourself as a Psi4 wavefunction object, it can be done.

Do your initial guesses obey point group symmetry? Do you want to use your guesses for systems with symmetry? Are your guesses associated with a specific basis set?

Hey, thanks for replying!
Here is a minimal example:

    import os
    import numpy as np
    import psi4

    alpha_coeffs = []
    beta_coeffs = []

    psi4.set_options({'basis': 'def2-svp'})

    # directory holds one geometry file per molecule
    for file in sorted(os.listdir(directory)):
        with open(os.path.join(directory, file)) as f:
            coord_block = f.read()
        psi4.geometry("\n" + coord_block)

        # One SCF run per molecule; keep the wavefunction to extract the orbitals
        E_scf, wfn = psi4.energy('scf', return_wfn=True)
        alpha_coeffs.append(wfn.Ca().to_array())
        beta_coeffs.append(wfn.Cb().to_array())

    np.save('alpha_coeffs.npy', alpha_coeffs, allow_pickle=True)
    np.save('beta_coeffs.npy', beta_coeffs, allow_pickle=True)

I ran an RHF SCF for some molecules and saved the alpha and beta coefficient matrices.
I am not sure whether point group symmetry is imposed by default. If so, is there a way to suppress it for the SCF calculation? Beyond that, I don’t know whether my guesses obey symmetry or how to check this with Psi4.
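
On the symmetry question: Psi4 detects point group symmetry by default, and a "symmetry c1" line in the molecule specification switches it off, so the coefficients come back as one matrix instead of one block per irreducible representation. A minimal sketch, assuming the coordinate block is a plain string as in the example above. The helper name with_c1_symmetry and the water geometry are made up for illustration and are not part of Psi4:

```python
def with_c1_symmetry(coord_block: str) -> str:
    """Return the geometry string with a 'symmetry c1' line appended.

    Hypothetical helper (not part of Psi4). If the block already contains
    a symmetry line, it is returned without adding another one.
    """
    lines = coord_block.strip().splitlines()
    has_symmetry = any(line.strip().lower().startswith("symmetry") for line in lines)
    if not has_symmetry:
        lines.append("symmetry c1")
    return "\n".join(lines) + "\n"


# Example geometry string (water, values made up for illustration)
water = """
O  0.000  0.000  0.117
H  0.000  0.757 -0.471
H  0.000 -0.757 -0.471
"""
c1_block = with_c1_symmetry(water)
# c1_block can then be passed to psi4.geometry(c1_block)
```

Running without symmetry also keeps the shape of wfn.Ca().to_array() predictable, which is convenient when the arrays are used as training data.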

To restate what I tried: I used “wfn.Ca().to_array()” to create the alpha coefficient matrix. I was looking for a way to do this in reverse, as you mentioned. I just could not find a method to transform a matrix into a wfn object, so that I can skip the calculation of the SAD guess and use my custom guess instead.

The new matrices have the same dimensions as the ones I got from the SCF, if that is what you mean by basis set association.

Kind regards

Are you doing any processing of the orbitals that you’re not showing in your minimal example?

Well, I am trying to train an ML model that predicts those orbitals. So yes, you could say I apply some random transformation that changes the values.

Would the approach I am looking for differ between feeding back the exact same orbitals that the SCF returned and feeding back “randomly” transformed versions of them?

Best

The procedure is going to be a little different, yes.

Psi only has one mechanism for reading orbitals. If the “GUESS” keyword is set to “READ”, Psi will look for a file containing all the information in a wavefunction. It reads that, converts that back to a wavefunction, and pulls the orbitals from there.

So what you need is to:

  1. Get a wavefunction object
  2. Put your modified orbitals on it.
  3. Save the wavefunction
  4. Make sure that Psi can find it when it comes time to read the guess.

Step 1 is easy. You already know this part.
Step 2 relies on the NumPy interface. scf_wfn.Ca().np[:] = array should work, and likewise for Cb(). This step wouldn’t be necessary if you weren’t modifying the orbitals. To confirm that the modification is working, I recommend printing the orbitals before and after the assignment.
Step 3 is also easy. Call the to_file method of the wavefunction. This takes a single argument, the filename to save.
Step 4 is a touch trickier. The filename that Psi is looking for follows the template scratchdirectory/output.moleculename.pid.180.npy, where scratchdirectory, pid and moleculename are replaced with the appropriate values. On my laptop, for instance, the file is /tmp/output.test_molecule.21789.180.npy. It will probably be necessary to move the file you saved in step 3 to the correct location, because of the process ID part. If there’s any doubt as to the correct value, this is the function that generates the scratchdirectory/output.moleculename.pid.180 part.
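
The filename template from Step 4 can be written out as a small helper. This is only a sketch: guess_file_path is a hypothetical name, not a Psi4 function, and the "output" prefix is taken from the template above on the assumption that your run uses the same prefix.

```python
import os


def guess_file_path(scratch_dir, molecule_name, pid=None, prefix="output"):
    """Build the path described by the template in Step 4.

    Hypothetical helper, not a Psi4 function. `prefix` defaults to "output",
    the value in the template quoted above; whether your run uses the same
    prefix is an assumption to verify.
    """
    if pid is None:
        pid = os.getpid()  # the process that will run the SCF
    filename = f"{prefix}.{molecule_name}.{pid}.180.npy"
    return os.path.join(scratch_dir, filename)


# Reproduces the laptop example from Step 4
example = guess_file_path("/tmp", "test_molecule", pid=21789)
print(example)
```

In a real script, the scratch directory and molecule name would come from the running Psi4 session rather than from hard-coded strings.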

I haven’t tried this myself, but that’s how all the pieces fit together. Please report back on whether this works, or if something still isn’t right.
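
The slice assignment in Step 2 matters because .np exposes a view of the matrix data: writing through the view changes the matrix itself, whereas rebinding a name only changes a local variable. The sketch below illustrates the difference with plain NumPy. The buffer array merely stands in for the memory behind Ca(), and overwrite_in_place is a hypothetical helper that also rejects arrays of the wrong shape.

```python
import numpy as np


def overwrite_in_place(view, new_values):
    """Copy new_values into an existing array view, refusing shape mismatches.

    Hypothetical helper; `view` plays the role of scf_wfn.Ca().np.
    """
    new_values = np.asarray(new_values, dtype=view.dtype)
    if new_values.shape != view.shape:
        raise ValueError(f"shape {new_values.shape} does not match {view.shape}")
    view[:] = new_values  # writes into the existing memory


# `buffer` stands in for the matrix storage, `view` for the .np accessor
buffer = np.zeros((3, 3))
view = buffer.view()

before = buffer.copy()
overwrite_in_place(view, np.eye(3))
changed = not np.array_equal(before, buffer)  # the storage was modified

# Rebinding the name, by contrast, leaves the storage untouched
view = np.ones((3, 3))
still_identity = np.array_equal(buffer, np.eye(3))
```

Comparing a copy taken before the assignment with the data afterwards is the same check as printing the orbitals before and after, just automated.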

Hey, thanks so much for your guide. I had to prepare more than I thought before I could try this.
After trying a lot I am still getting:

Unable to find file 180, defaulting to SAD guess.

Here are the steps I did for the molecule CH2O:

    first_alpha = alphas[0] #alphas shape is (200,38,38)
    first_file = test_files[0] # this is a list of the .xyz file names to later iterate over

    f = open(dir+test_files[0])
    coord_block = ""
    for line in f:
        coord_block += line

    m = psi4.geometry("\n" +  coord_block
    )

    m.set_name('my_mol_name')
    m_name=m.name()
    psi4.set_options({'basis':'def2-svp', 'guess': 'read'})

     1.
    e_scf,wfn = energy('scf',return_wfn=True)

     2.
    wfn.Ca().np[:] = first_alpha
    wfn.Cb().np[:] = first_alpha

     4.
    pid=os.getpid()
    scratchdir1=core.IOManager.shared_object().get_default_path()

    f_name = scratchdir1+'output.'+'{}'.format(m_name)+'.' +'{}'.format(pid)+'.'+str(180)+'.npy'

     3.
    wfn.to_file(f_name) #this is saved in current folder

I set the scratch directory to the directory I am working in. The output generated in step 3 is stored in my scratch directory as:

output.my_mol_name.30591.180.npy

Am I missing something obvious? Also, this .npy file stays in my scratch directory permanently. I assume I am missing something there too, since doing it like this would create hundreds of these .npy files.

Kind regards and thank you!

Could you give me a minimal working example? What you gave me is not quite a working example, as I don’t know where alphas comes from.

Please give me a script that actually runs. Even when I use the supplemental files you emailed me, the script crashes Python with obvious SyntaxErrors. For example, " 2." should of course be commented out.

Hey, sorry I’ve changed the code. This should work now:

    import numpy as np
    import psi4
    import os

    dir='path/to/xyz_files'


    alphas = np.load('/../exp_alphas.npy')
    test_files = ['mol506.xyz']

    f = open(dir+test_files[0])
    coord_block = ""
    for line in f:
        coord_block += line

    m = psi4.geometry("\n" +  coord_block
    )

    m.set_name('my_mol_name')
    m_name=m.name()
    psi4.set_options({'basis':'def2-svp', 'guess': 'read'})

    # 1. #
    e_scf,wfn = energy('scf',return_wfn=True)

    # 2. #
    wfn.Ca().np[:] = first_alpha
    wfn.Cb().np[:] = first_alpha

    # 4. #
    pid=os.getpid()
    scratchdir1=core.IOManager.shared_object().get_default_path()

    f_name = scratchdir1+'output.'+'{}'.format(m_name)+'.' +'{}'.format(pid)+'.'+str(180)+'.npy'

    # 3. #
    wfn.to_file(f_name) #this is saved in current folder

Your script still doesn’t run. You haven’t defined first_alpha. Please check that the file you give me actually produces the error you want me to debug.

Yes, you’re right, it’s just the first element of alphas:

    import numpy as np
    import psi4
    import os

    dir='path/to/xyz_files'


    alphas = np.load('/../exp_alphas.npy')
    test_files = ['mol506.xyz']

    f = open(dir+test_files[0])
    coord_block = ""
    for line in f:
        coord_block += line

    m = psi4.geometry("\n" +  coord_block
    )

    m.set_name('my_mol_name')
    m_name=m.name()
    psi4.set_options({'basis':'def2-svp', 'guess': 'read'})

    # 1. #
    e_scf,wfn = energy('scf',return_wfn=True)

    # 2. #
    wfn.Ca().np[:] = alphas[0]
    wfn.Cb().np[:] = alphas[0]

    # 4. #
    pid=os.getpid()
    scratchdir1=core.IOManager.shared_object().get_default_path()

    f_name = scratchdir1+'output.'+'{}'.format(m_name)+'.' +'{}'.format(pid)+'.'+str(180)+'.npy'

    # 3. #
    wfn.to_file(f_name) #this is saved in current folder

After modifying your file to account for the fact that I have both exp_alphas.npy and mol506.xyz in the same directory with the input, I can read in the orbitals after you set them. They are poor orbitals for this geometry, but I can read them.

I’m using the current master version of Psi4. What version of Psi4 are you using?

I’m using version 1.4a1

Could you show me an output file that shows the orbitals not being read in?

Having looked at your file (and in future, please upload .dat files here rather than e-mailing me), Psi4 is working just fine.

Psi4 can’t read orbitals before you write them to a wavefunction. If you have them in a pre-existing wavefunction file and you want to read that in, you need to move the wavefunction file to the path you currently write to, that is f_name = scratchdir1+'output.'+'{}'.format(m_name)+'.' +'{}'.format(pid)+'.'+str(180)+'.npy'.

To create a wfn object, don’t I have to keep this order (psi4.geometry → setting name/set options → create wfn object by calling the energy function → and then load my guess)?

Also, I am not sure how to save the wfn object in advance, since the file name contains changing values like the pid.
I thought the above code worked for you, or were these the changes you mentioned (“to account for the fact that I have both exp_alphas.npy and mol506.xyz in the same directory with the input”)?

Best

There was a miscommunication. What you told me was that you were getting “Unable to find file 180, defaulting to SAD guess”. To me, it was obvious that the message should occur for the first SCF call, before you’ve saved the orbitals, so I assumed you meant “when I add a second SCF call, I get that error message.” That is why I said “I can read in the orbitals after you set them”.

I don’t know the details of your workflow, but you will need the orbitals on a saved wavefunction object before the SCF call that uses them as a guess. As for how you get that wavefunction, you could use a wavefunction from a previous energy run, even one at a different geometry. Just change the file name to the one that Psi4 is going to expect.
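
Renaming a previously saved wavefunction file to the expected name could look roughly like this. It is a sketch under assumptions: stage_guess_file is a hypothetical helper, not part of Psi4, and the naming template with its "output" prefix is taken from the posts above and may differ between Psi4 versions. The demonstration uses a placeholder file instead of a real wavefunction.

```python
import os
import shutil
import tempfile


def stage_guess_file(saved_wfn_path, scratch_dir, molecule_name, pid=None, prefix="output"):
    """Copy a saved wavefunction file to the name described in this thread.

    Hypothetical helper, not part of Psi4. The naming template
    (prefix.moleculename.pid.180.npy) and the "output" prefix come from the
    posts above and are assumptions to check against your Psi4 version.
    """
    if pid is None:
        pid = os.getpid()  # must be the process that runs the SCF
    target = os.path.join(scratch_dir, f"{prefix}.{molecule_name}.{pid}.180.npy")
    shutil.copyfile(saved_wfn_path, target)
    return target


# Demonstration with a placeholder file instead of a real wavefunction
with tempfile.TemporaryDirectory() as scratch:
    saved = os.path.join(scratch, "previous_run.npy")
    with open(saved, "wb") as handle:
        handle.write(b"placeholder")
    staged = stage_guess_file(saved, scratch, "my_mol_name")
    staged_name = os.path.basename(staged)
    staged_exists = os.path.exists(staged)
    os.remove(staged)  # clean up so guess files do not accumulate
```

In an actual run, the copy has to happen in the same process and before the energy call that has guess set to read. Removing the staged file afterwards keeps the scratch directory from filling up with one .npy file per molecule.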

I see. It makes sense to me that the loaded wavefunction has to be ready before the “custom run”.
So, in the example attached I tried to create a wavefunction, load and save it and then run another scf calculation, loading the wavefunction and adapting the name. For the second part I still get “Unable to find file 180, defaulting to SAD guess”. Am I missing something?
3_calc.py (923 Bytes)

That file works just fine for me using the current master version of Psi4, after changing the dir variable to match my file paths.

Perhaps you can try updating Psi4 to the release candidate for 1.4?