Appendix 2: Opt algorithm fmax experiment

We survey the behavior of structural optimization parameter fmax.

[ ]:
!pip install more_itertools
[2]:
from ase import Atoms
from ase.constraints import ExpCellFilter, FixAtoms
from ase.build import bulk, molecule, add_adsorbate, surface
from ase.optimize import FIRE
from ase.units import kB
import pandas as pd
from tqdm.auto import tqdm
import numpy as np
from more_itertools import windowed
import pfp_api_client
from pfp_api_client.pfp.calculators.ase_calculator import ASECalculator
from pfp_api_client.pfp.estimator import Estimator, EstimatorCalcMode
from pfcc_extras.structure.ase_rdkit_converter import smiles_to_atoms, atoms_to_smiles
from pfcc_extras.visualize.view import view_ngl
from matlantis_features.features.common.fire_lbfgs import FIRELBFGS
/home/jovyan/.local/lib/python3.7/site-packages/matlantis_features/features/common/fire_lbfgs.py:12: FutureWarning: matlantis_features.features.common.fire_lbfgs is deprecated; use ase_ext.optimize
  FutureWarning,
[3]:
def max_distance(atoms1: Atoms, atoms2: Atoms) -> float:
    return float(np.max(np.linalg.norm(atoms1.positions - atoms2.positions, axis=1)))
[4]:
calc_mol = ASECalculator(Estimator(calc_mode=EstimatorCalcMode.MOLECULE, model_version="v2.0.0"))
calc = ASECalculator(Estimator(calc_mode=EstimatorCalcMode.CRYSTAL, model_version="v2.0.0"))
calc_d3 = ASECalculator(Estimator(calc_mode=EstimatorCalcMode.CRYSTAL_PLUS_D3, model_version="v2.0.0"))

Example where fmax=0.05eV/A is sufficient

The default value of fmax is 0.05eV/A in ASE, which is accurate enough in many cases.

Let’s check this for molecular and solid examples, respectively.

Example of molecule

[5]:
fmax_list = [0.05, 0.01, 0.001]
[6]:
atoms = molecule("C2H6")
atoms.rattle(stdev=0.1)
atoms.calc = calc_mol
images = []
for fmax in tqdm(fmax_list):
    with FIRE(atoms, logfile=None) as opt:
        opt.run(fmax=fmax)
        images.append(atoms.copy())
[7]:
for atoms in images:
    atoms.calc = calc_mol

results = {
    "fmax": [],
    "energy (eV)": [],
    "distance (A)": [],
}
for (fmax_old, old), (fmax_new, new) in windowed(zip(fmax_list, images), 2):
    de = new.get_potential_energy() - old.get_potential_energy()
    dx = max_distance(new, old)
    results["energy (eV)"].append(de)
    results["distance (A)"].append(dx)
    results["fmax"].append(f"fmax {fmax_old}->{fmax_new}")
pd.DataFrame(results)
[7]:
fmax energy (eV) distance (A)
0 fmax 0.05->0.01 -0.002484 0.030840
1 fmax 0.01->0.001 -0.000402 0.017346

Example of solid

[8]:
fmax_list = [0.05, 0.01, 0.001]
[9]:
atoms = bulk("Pt") * (4, 4, 4)
atoms.rattle(stdev=0.1)
atoms.calc = calc_mol
images = []
for fmax in tqdm(fmax_list):
    with FIRE(atoms, logfile=None) as opt:
        opt.run(fmax=fmax)
        images.append(atoms.copy())
[10]:
for atoms in images:
    atoms.calc = calc_mol

results = {
    "fmax": [],
    "energy (eV)": [],
    "distance (A)": [],
}
for (fmax_old, old), (fmax_new, new) in windowed(zip(fmax_list, images), 2):
    de = new.get_potential_energy() - old.get_potential_energy()
    dx = max_distance(new, old)
    results["energy (eV)"].append(de)
    results["distance (A)"].append(dx)
    results["fmax"].append(f"fmax {fmax_old}->{fmax_new}")
pd.DataFrame(results)
[10]:
fmax energy (eV) distance (A)
0 fmax 0.05->0.01 -0.005323 0.016975
1 fmax 0.01->0.001 -0.001351 0.013533

In both cases, even the most mobile atom moved only about 0.01A and its energy changed only about 0.01eV. The energy difference of 0.01 eV is about 1.5 times the difference in existence ratio at 300 K. If you want to search for materials at room temperature, this value may be enough for the precision you want in your numerical calculations.

In addition, the chemical accuracy, which is considered to correspond to the accuracy of experimental chemistry, is about 0.04 eV, but many numerical methods do not reach that accuracy, and in this sense, this accuracy is probably not necessary in many cases.

The existence ratio at 300 K of two structures with an energy difference of 0.01 eV is obtained from the Boltzmann factor as follows

[11]:
np.exp(0.01 / (300 * kB))
[11]:
1.4722876299701386

Example where fmax=0.05eV is insufficient

Examples of lack of precision with respect to energy

This example is a little tricky, but let us consider an example where 2,3-Dimethyl-2-butene is loosely adsorbed on the Pt surface. In such an example, due to the gentle force acting on the molecule, the structural optimization stops halfway at fmax=0.05eV/A, resulting in a large energy difference compared to a further optimization. Since this kind of situation could occur, it would be better to use the smallest fmax possible for a more reliable structural optimization.

[12]:
bulk1111 = bulk("Pt")
bulk1111.calc = calc_d3
with FIRELBFGS(ExpCellFilter(bulk1111), logfile=None) as opt:
    opt.run(0.0001)
[13]:
atoms = surface(bulk1111, (1, 1, 1), 4, vacuum=20.0) * (4, 4, 1)
c = atoms.cell[2, 2] / 2
atoms.constraints = [FixAtoms(mask=[atom.position[2] < c for atom in atoms])]
atoms.rattle(stdev=0.1)
ads = smiles_to_atoms("CC(=C(C)C)C")
v = (ads.positions[0] - ads.positions[1]) / 2
p = (atoms.cell[0, :2] + atoms.cell[1, :2]) / 2
add_adsorbate(atoms, ads, 4.0, position=tuple(p))
atoms.positions[64:] = atoms.positions[64:] + v
atoms.calc = calc_d3
[14]:
view_ngl(atoms, representations=["ball+stick"])
[15]:
fmax_list = [0.05, 0.01, 0.001]
[16]:
images = []
for fmax in tqdm(fmax_list):
    with FIRELBFGS(atoms, logfile=None) as opt:
        opt.run(fmax=fmax)
        images.append(atoms.copy())
[17]:
for atoms in images:
    atoms.calc = calc_mol

results = {
    "fmax": [],
    "energy (eV)": [],
    "distance (A)": [],
}
for (fmax_old, old), (fmax_new, new) in windowed(zip(fmax_list, images), 2):
    de = new.get_potential_energy() - old.get_potential_energy()
    dx = max_distance(new, old)
    results["energy (eV)"].append(de)
    results["distance (A)"].append(dx)
    results["fmax"].append(f"fmax {fmax_old}->{fmax_new}")
pd.DataFrame(results)
[17]:
fmax energy (eV) distance (A)
0 fmax 0.05->0.01 -0.213606 4.068967
1 fmax 0.01->0.001 -0.987591 2.578896
[18]:
view_ngl(images, representations=["ball+stick"], replace_structure=True)

Example of lack of precision with respect to structure (1)

In the example of ethylene adsorbed on a Pt(111) surface, the structural optimization stops halfway at fmax=0.05eV.

[19]:
bulk1111 = bulk("Pt")
bulk1111.calc = calc
with FIRELBFGS(ExpCellFilter(bulk1111), logfile=None) as opt:
    opt.run(0.0001)
[20]:
atoms = surface(bulk1111, (1, 1, 1), 4, vacuum=20.0) * (4, 4, 1)
c = atoms.cell[2, 2] / 2
atoms.constraints = [FixAtoms(mask=[atom.position[2] < c for atom in atoms])]
atoms.calc = calc
atoms.rattle(stdev=0.1)
add_adsorbate(atoms, smiles_to_atoms("C=C"), 2.0, position=tuple((atoms.cell[0, :2] + atoms.cell[1, :2]) / 2))
[21]:
view_ngl(atoms, representations=["ball+stick"])
[22]:
images = []
for fmax in tqdm(fmax_list):
    with FIRELBFGS(atoms, logfile=None) as opt:
        opt.run(fmax=fmax)
        images.append(atoms.copy())
[23]:
for atoms in images:
    atoms.calc = calc_mol

results = {
    "fmax": [],
    "energy (eV)": [],
    "distance (A)": [],
}
for (fmax_old, old), (fmax_new, new) in windowed(zip(fmax_list, images), 2):
    de = new.get_potential_energy() - old.get_potential_energy()
    dx = max_distance(new, old)
    results["energy (eV)"].append(de)
    results["distance (A)"].append(dx)
    results["fmax"].append(f"fmax {fmax_old}->{fmax_new}")
pd.DataFrame(results)
[23]:
fmax energy (eV) distance (A)
0 fmax 0.05->0.01 -0.020390 0.285722
1 fmax 0.01->0.001 -0.001203 0.083779
[24]:
view_ngl(images, representations=["ball+stick"], replace_structure=True)

There is an obvious difference between the structure optimized halfway and one that fully relaxed. However, the energy difference is not so big in this example, so we may decide to ignore the difference in some cases.

Example of lack of precision with respect to structure (2)

Let us try to rotate the methyl group of toluene, which is well known as a difficult system to optimize the structure.

[25]:
atoms = smiles_to_atoms("Cc1ccccc1")
tmp = atoms[7:10]
tmp.rotate([1.0, 0.0, 0.0], 15.0)
atoms.positions[7:10] = tmp.positions
atoms.calc = calc_mol
[26]:
view_ngl(atoms, representations=["ball+stick"])
[27]:
images = []
for fmax in tqdm(fmax_list):
    with FIRELBFGS(atoms, logfile=None) as opt:
        opt.run(fmax=fmax)
        images.append(atoms.copy())
[28]:
for atoms in images:
    atoms.calc = calc_mol

results = {
    "fmax": [],
    "energy (eV)": [],
    "distance (A)": [],
}
for (fmax_old, old), (fmax_new, new) in windowed(zip(fmax_list, images), 2):
    de = new.get_potential_energy() - old.get_potential_energy()
    dx = max_distance(new, old)
    results["energy (eV)"].append(de)
    results["distance (A)"].append(dx)
    results["fmax"].append(f"fmax {fmax_old}->{fmax_new}")
pd.DataFrame(results)
[28]:
fmax energy (eV) distance (A)
0 fmax 0.05->0.01 -0.001788 0.028956
1 fmax 0.01->0.001 -0.001723 0.260170
[29]:
view_ngl(images, representations=["ball+stick"], replace_structure=True)

You can see that the rotation of the methyl group of toluene does not change the energy difference much while the angle is changed. In such an example, we need to use a small fmax to predict the most stable structure. The energy difference of 0.002 eV has only an effect of 1.08 times the existence ratio at 300 K, so it is safe to say that it has almost no effect. Therefore, such a small energy difference can be ignored in some cases. However, if you want to perform an analysis where the energy is assumed to be extremely small, such as vibration analysis, it may have a negative impact on the analysis.

[30]:
np.exp(0.002 / (300 * kB))
[30]:
1.080434721876578