Memory/Core effect on reproducing calculations to higher precision

Hi,

This post is a continuation of my previous post, Psi4 cartesian coordinates precision from same starting point geometry, but this time highlighting/summarising the same behaviour with respect to memory rather than the number of cores.

In the previous post I noticed that if I ran an optimisation calculation (with the same memory allotment) on 1 core, I got final geometries that matched exactly to all recorded decimal places, whereas running with >1 core I got final geometries that differed slightly: roughly 10^-5 Å RMSD from overlaying two output xyz files for the molecules I'm working with, and in some cases individual coordinates differed in the 3rd decimal place, so I couldn't safely round them to get a perfect match.
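For reference, here is a minimal sketch of the kind of comparison I mean (not the exact script I used), assuming the two output xyz files have the same atom ordering; the file names are just taken from the ethane example further down:

import numpy as np

def read_xyz(path):
    # Standard xyz file: skip the atom-count and comment lines,
    # return an (N, 3) array of Cartesian coordinates.
    with open(path) as f:
        lines = f.readlines()[2:]
    return np.array([[float(x) for x in line.split()[1:4]]
                     for line in lines if line.strip()])

def aligned_rmsd(a, b):
    # Centre both geometries, rotate b onto a with the Kabsch algorithm,
    # then return the RMSD and the largest single-coordinate deviation.
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    u, _, vt = np.linalg.svd(b.T @ a)
    d = np.sign(np.linalg.det(u @ vt))
    rot = u @ np.diag([1.0, 1.0, d]) @ vt
    b_rot = b @ rot
    rmsd = np.sqrt(np.mean(np.sum((a - b_rot) ** 2, axis=1)))
    return rmsd, np.abs(a - b_rot).max()

geom1 = read_xyz("ethane_opt_1core_1000mb_1.xyz")
geom2 = read_xyz("ethane_opt_1core_1000mb_2.xyz")
print(aligned_rmsd(geom1, geom2))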

I have now noticed that I can get similar differences when changing the amount of memory set for a calculation.

For example, with a test ethane molecule, using variations of the code below:
ethane.xyz (429 Bytes)

import psi4

# Output file and resources for this run.
psi4.set_output_file("ethane_opt_1core_1000mb_1.log")
psi4.core.set_num_threads(1)

memory_total = "1000MB"
psi4.set_memory(memory_total)

psi4.set_options({"basis": "6-311G**"})

# Read the starting geometry and keep it fixed in space/orientation
# so that repeated runs start from an identical frame.
with open("ethane.xyz", "r") as r:
    xyz = r.read()

mol = psi4.core.Molecule.from_string(xyz, fix_com=True, fix_orientation=True)

energy, wfn = psi4.optimize("B3LYP", molecule=mol, return_wfn=True)
psi4.core.clean()

print(energy)
mol.save_xyz_file("ethane_opt_1core_1000mb_1.xyz", 1)
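To generate the different entries in the table further down I just varied the thread count, memory string, and file names. Roughly, something like this hypothetical wrapper (the helper name and loop values are illustrative, not the exact script I ran):

import psi4

def run_opt(n_threads, memory, tag):
    # Hypothetical helper: one optimisation per (threads, memory) setting,
    # writing a log and xyz named after the settings and repeat tag.
    psi4.core.clean_options()
    psi4.set_output_file(f"ethane_opt_{tag}.log")
    psi4.core.set_num_threads(n_threads)
    psi4.set_memory(memory)
    psi4.set_options({"basis": "6-311G**"})

    with open("ethane.xyz") as r:
        xyz = r.read()
    mol = psi4.core.Molecule.from_string(xyz, fix_com=True, fix_orientation=True)

    energy = psi4.optimize("B3LYP", molecule=mol)
    psi4.core.clean()
    mol.save_xyz_file(f"ethane_opt_{tag}.xyz", 1)
    return energy

# Two repeats per setting, e.g. 1 core / 1000MB and 2 cores / 1000MB.
for threads, memory in [(1, "1000MB"), (2, "1000MB")]:
    for repeat in (1, 2):
        tag = f"{threads}core_{memory.lower()}_{repeat}"
        print(tag, run_opt(threads, memory, tag))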

Using the above code and varying the memory setting, I noticed that for values above 500MB I always get one repeated answer, and for values below 500MB I always get a different repeated answer, even though in both cases the output says it performed Algorithm: Core. I've noticed similar behaviour between Core and Disk, as well as between Disk and Disk, depending on the amount of memory. I've summarised this in the table below:

NAMES             ENERGIES (Eh)          RMSD (Å)     Linux diff
1core_1000mb_1    -79.85628285596978     2.26E-16     No difference
1core_1000mb_2    -79.85628285596978

1core_4000mb_1    -79.85628285596978     2.26E-16     No difference
1core_4000mb_2    -79.85628285596978

1core_263mb_1     -79.85628285554220     2.26E-16     No difference
1core_263mb_2     -79.85628285554220

1core_<=500mb     -79.85628285554220     1.56E-10     Difference
1core_>500mb      -79.85628285596978

2core_1000mb_1    -79.85628285859510     8.26E-10     Difference
2core_1000mb_2    -79.85628285942450
Here 1core/2core is the number of cores used for the calculation, Xmb is the amount of memory set, and _1/_2 labels the two repeats of the calculation.
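For completeness, this is roughly how I checked each pair of runs (a hypothetical helper, not the exact commands): it pulls the "Algorithm:" lines Psi4 prints in the log and does a plain line-by-line diff of the two xyz files, like the Linux diff column above.

import difflib

def scf_algorithms(log_path):
    # Collect every "Algorithm:" line reported in a Psi4 log (Core vs Disk).
    with open(log_path) as f:
        return [line.strip() for line in f if "Algorithm:" in line]

def xyz_diff(path_1, path_2):
    # Line-by-line diff of two xyz files, mirroring the Linux diff column.
    with open(path_1) as f1, open(path_2) as f2:
        return list(difflib.unified_diff(f1.readlines(), f2.readlines(),
                                         fromfile=path_1, tofile=path_2))

print(scf_algorithms("ethane_opt_1core_1000mb_1.log"))
diff = xyz_diff("ethane_opt_1core_1000mb_1.xyz", "ethane_opt_1core_1000mb_2.xyz")
print("No difference" if not diff else "Difference")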

The energies meet the optimisation and single-point tolerances, and the geometries likewise meet the Disp convergence targets, so that part is fine. What I'm curious about is that they differ at all depending on the memory allotment or the number of cores set. Is this known/intended behaviour, and if so, could you give me a rough idea of why it happens?

Thank you
Jay

Any chance the number of geometry optimisation iterations is equal, or differs by 1, between your matching and non-matching cases?

Hi,

In all cases the geometries optimise in 4 steps for the ethane molecule.

Attached below are some of the log files for the examples I mentioned:
ethane_opt_1core_500mb_2.txt (109.7 KB)
ethane_opt_1core_500mb_1.txt (109.7 KB)

ethane_opt_1core_1gb_2.txt (109.7 KB)
ethane_opt_1core_1gb_1.txt (109.7 KB)

ethane_opt_2core_1000mb_2.txt (109.7 KB)
ethane_opt_2core_1000mb_1.txt (109.7 KB)

ethane_opt_2core_2000mb_2.txt (109.7 KB)
ethane_opt_2core_2000mb_1.txt (109.7 KB)