Quantum chemistry calculation in Python (Psi4)

Introduction

What is Psi4

It is an open source quantum chemistry calculation software that can be run using python.

About this article

How to use (mainly code) is New form of chemistry, Is your order a lead compound? was used as a reference. Thank you very much. (The above HP has more information on how to use it.)

What I did (purpose)

** Structural optimization calculation ** was performed on aspirin (acetylsalicylic acid) to obtain the following information.

--Structural information of optimized structure (coordinates of each atom)

Preparation

Module installation

conda install psi4 psi4-rt python=3.7 -c psi4
conda install -c rdkit rdkit

Code (up to structural optimization)

Module import

import numpy as np
import pandas as pd
from rdkit import Chem
from rdkit.Chem.rdDistGeom import ETKDGv3, EmbedMolecule
from rdkit.Chem.rdForceFieldHelpers import MMFFHasAllMoleculeParams, MMFFOptimizeMolecule
from rdkit.Chem.Draw import IPythonConsole
import psi4

import datetime
import time

From calculation setting to structural optimization

%%time
#Take a look at the calculation time

#Hardware side settings (CPU thread number and memory settings used for calculation)
psi4.set_num_threads(nthread=1)
psi4.set_memory("1GB")

#Molecules to enter (acetylsalicylic acid)
smiles = 'CC(=O)Oc1ccccc1C(=O)O'

#Decide the file name
t = datetime.datetime.fromtimestamp(time.time())
psi4.set_output_file("{}{}{}_{}{}_{}.log".format(t.year,
                                              t.month,
                                              t.day,
                                              t.hour,
                                              t.minute,
                                                smiles))


#Coarse 3D structure optimization by generating 3D structure from SMILES
mol = Chem.MolFromSmiles(smiles)
mol = Chem.AddHs(mol)
params = ETKDGv3()
params.randomSeed = 1
EmbedMolecule(mol, params)

#Structural optimization with MMFF (Merck Molecular Force Field)
MMFFOptimizeMolecule(mol)
#UFF (Universal Force Field) If you want to optimize the structure in a universal force field
#UFFOptimizeMolecule(mol)

conf = mol.GetConformer()


#Convert to a format that can be input to Psi4.
#Set charge and spin multiplicity (below is charge 0, spin multiplicity 1)
mol_input = "0 1"

#Describe the coordinates of each atom in XYZ format
for atom in mol.GetAtoms():
    mol_input += "\n " + atom.GetSymbol() + " " + str(conf.GetAtomPosition(atom.GetIdx()).x)\
    + " " + str(conf.GetAtomPosition(atom.GetIdx()).y)\
    + " " + str(conf.GetAtomPosition(atom.GetIdx()).z)
    
molecule = psi4.geometry(mol_input)

#Set calculation method (functional) and basis set
level = "b3lyp/6-31G*"

#Calculation method (functional), example of basis set
#theory = ['hf', 'b3lyp']
#basis_set = ['sto-3g', '3-21G', '6-31G(d)', '6-31+G(d,p)', '6-311++G(2d,p)']

#Perform structural optimization calculations
energy, wave_function = psi4.optimize(level, molecule=molecule, return_wfn=True)

'''output
Optimizer: Optimization complete!
CPU times: user 15min 50s, sys: 27.5 s, total: 16min 17s
Wall time: 17min 18s
'''

Using 1 thread and 1GB of memory, it took a total of 17 minutes to calculate one molecule.

By the way, the input file (XYZ format) looks like this 0 1 C 2.974609598387524 0.6502198947117926 -1.4939659177438849 C 1.5230890246254225 0.5053737824529224 -1.1530242528515486 O 0.603330778130228 1.057538083744091 -1.7483070198697948 O 1.387914769521494 -0.30271498078404824 -0.03263846326379664 C 0.054417291055592946 -0.5712371294324196 0.2945337836815555 C -0.4224582203026945 -1.8473553693968503 -0.01601581679481116 C -1.7257359235711085 -2.203041426185137 0.3353284417639963 C -2.5433749515038557 -1.2955508196005052 1.0118502341281146 C -2.060774394293738 -0.026729946932774716 1.3386540927479436 C -0.7608195905359867 0.34046831642295106 0.9729716175553647 C -0.29830011960960084 1.6797421182462917 1.3678608910543 O -0.14779504472207144 2.009190935828314 2.530629328049564 O -0.14453966540626556 2.510886086741464 0.31763612590410073 H 3.485468003428318 1.2068580777799764 -0.7047280552402798 H 3.4260440775599967 -0.33672600434994854 -1.6236316858587398 H 3.071579924882846 1.198336303548168 -2.4353976902946775 H 0.21564436873480777 -2.565765313412947 -0.5236197408913651 H -2.1022344085524307 -3.194413587471996 0.09265503994624906 H -3.5541301454100207 -1.5811057139723368 1.2953607541830348 H -2.6935167212816706 0.6706025040826632 1.8845251711162185 H -0.28841865113707016 2.0954241879804125 -0.5663389837709468

Code (getting information from the optimized structure)

Structural information (XYZ format)

print(molecule.save_string_xyz())
#The unit is angstrom
'''
0 1
 C    3.098884439155    0.286778541532   -1.818971930267
 C    1.631416601050    0.255588755108   -1.492773907603
 O    0.755324426944    0.834934357982   -2.102478251457
 O    1.396466682379   -0.489685966862   -0.380657317744
 C    0.074752427021   -0.810467114907   -0.015892403575
 C   -0.343553415622   -2.104118243082   -0.317545006753
 C   -1.590020749705   -2.548730143826    0.117945091675
 C   -2.402170016296   -1.705115088782    0.878418776055
 C   -1.963835487350   -0.422144280017    1.191561137714
 C   -0.729128903046    0.059858926344    0.731238467127
 C   -0.357527633804    1.464587591984    1.139219316977
 O   -0.521525170787    1.863006769904    2.268651582597
 O    0.121812264509    2.275304342291    0.176882785876
 H    3.636482218213    0.795645992511   -1.011210543547
 H    3.498253279532   -0.729173607712   -1.889712382197
 H    3.251358031652    0.823578110904   -2.755331437125
 H    0.322818412683   -2.753052223004   -0.876772337761
 H   -1.916523030934   -3.555688162495   -0.124828509423
 H   -3.368180133191   -2.049556108518    1.235066217610
 H   -2.570453292820    0.238058474783    1.802873381996
 H    0.073192650040    1.843321147028   -0.703222197850
'''

Energy, HOMO, LUMO energy

#The unit is atomic unit (a.u.Or Hartrees)

#energy
print(round(energy,3),'a.u.')
# >>> -648.689 a.u.

#Display HOMO (Unit: au= Hartree)
LUMO_idx = wave_function.nalpha()
HOMO_idx = LUMO_idx - 1

homo = wave_function.epsilon_a_subset("AO", "ALL").np[HOMO_idx]
lumo = wave_function.epsilon_a_subset("AO", "ALL").np[LUMO_idx]

print('homo:',round(homo,5),' a.u.')
print('lumo:',round(lumo,5),' a.u.')
# >>> homo: -0.26022 a.u.
# >>> lumo: -0.04309 a.u.

Mulliken charge

psi4.oeprop(wave_function, 'MULLIKEN_CHARGES') 
mulliken = np.array(wave_function.atomic_point_charges()) 

for i, atom in enumerate(mol.GetAtoms()):
    print(atom.GetSymbol(),round(mulliken[i],4))
    
#Try to visualize with a similarity map
from rdkit.Chem.Draw import SimilarityMaps 
fig = SimilarityMaps.GetSimilarityMapFromWeights(mol, 
                                                 mulliken, 
                                                 colorMap='RdBu')
'''output
C -0.527
C 0.6015
O -0.4756
O -0.5025
C 0.2962
C -0.147
C -0.1334
C -0.1206
C -0.1705
C 0.0273
C 0.5345
O -0.4395
O -0.5899
H 0.2051
H 0.1907
H 0.1951
H 0.1468
H 0.1476
H 0.1462
H 0.1705
H 0.4445
'''

image.png

Since Mulliken density analysis assumes that the smallest basis set such as STO-3G is used for the basis set, ** an incorrect charge distribution may be obtained when using a large basis set *. * And that.

Dipole moment

From each coordinate component of the dipole moment, the magnitude of the dipole moment of the entire molecule is obtained.


dipole_x, dipole_y, dipole_z = psi4.variable('SCF DIPOLE X'), psi4.variable('SCF DIPOLE Y'), psi4.variable('SCF DIPOLE Z') 
dipole_moment = np.sqrt(dipole_x ** 2 + dipole_y ** 2 + dipole_z ** 2)

#The unit is Debye(D,debye)
print(round(dipole_moment,3),'D')
# >>> 5.175 D

About calculation method

-** HF method (SCF method) **: Molecular orbital method that solves the electronic state, which is the solution of the Schrödinger equation, for the approximate wave function. --Since we are dealing with electrons moving in an averaged electron cloud, we are underestimating the repulsion between the electrons. (The difference between the electronic state solved by the approximate wave function of the HF method and the actual value is called ** electron correlation ) - DFT calculation **: Solve the equivalent Kohn-Sham equation instead of the Schrodinger equation --The exchange interaction and the electron correlation interaction, which are not included in the HF method, are partially considered for the exact solution. ――The calculation can be overwhelmingly fast, but the choice of functional has a great effect on the calculation result. --B3LYP is often used (because there is little deviation from the experimental values)

About basis functions

What is a basis function?

** What size and shape of the electron and in what orbit "is expressed by a mathematical formula ** Example) 6-31G: It means that the inner shell orbital is divided into 6 Gauss type functions, and the valence orbital is divided into 3 and 1 Gauss type functions on the inside and outside.

How to choose basis functions (Reference)

There is no universal basis set system in all systems, and it is important to select a function system that suits the purpose each time according to the calculation purpose.

--If it is a general organic compound, first select it based on 6-31 (d). (It seems that good results will be obtained for the calculation time) --If the molecular weight is 100 or less, the combination of B3LYP and 6-31G (d) will give results within a practical calculation time.

About auxiliary functions (** d ** part of 6-31G (d))

-** d **, ** p : Polarization function (considering the effects of intramolecular bond formation and polarization of electrostatic interactions between ions) - d : Apply polarization function to atoms other than hydrogen atom - p : Apply polarization function to hydrogen atom - + , ** ++ : diffuse function (considering the spread of electron distribution) - + : Apply diffuse function to atoms heavier than H, He -++: Apply the diffuse function to light elements (H, He)

Proper use of auxiliary functions (** d ** part of 6-31G (d))

--General organic compounds: 6-31G (d) (polarization function: d added) --Negatively charged chemical species (anion), excited state: 6-31 + G (d) (diffuse function: + added) --Chemical species with positive charge (cation): 6-31G (d, p) (polarization function: dp added)

reference

  1. Introduction to Computational Chemistry with python and psi4 (New form of chemistry)
  2. Charge in Computational Chemistry: Electron Density Analysis Using psi4 (New Form of Chemistry)
  3. Learn the basics of structural optimization in computational chemistry with psi4 (a new form of chemistry)
  4. Quantum chemistry calculation in Python (Is your order a lead compound?)
  5. psi4 official document
  6. psi4 official youtube
  7. Which basis function should I choose? (PC CHEM BASIS CS.COM)
  8. What is HF/3-21G? (PC CHEM BASISCS.COM)
  9. What is B3LYP/6-31G? (PC CHEM BASISCS.COM)

Recommended Posts

Quantum chemistry calculation in Python (Psi4)
Chemistry in Python
Date calculation in python
Date calculation in Python
Shapley value calculation in Python
Accelerometer Alan Variance Calculation in Python
Let's make a combination calculation in Python
Time comparison: Correlation coefficient calculation in Python
Quadtree in Python --2
Python in optimization
CURL in python
Geocoding in python
SendKeys in Python
Meta-analysis in Python
Unittest in python
Epoch in Python
Discord in Python
Sudoku in Python
DCI in Python
quicksort in python
nCr in python
N-Gram in Python
Programming in python
Plink in Python
Constant in python
Lifegame in Python.
FizzBuzz in Python
Sqlite in python
StepAIC in Python
N-gram in python
LINE-Bot [0] in Python
Csv in python
Disassemble in Python
Reflection in Python
Constant in python
nCr in Python.
format in python
Scons in Python3
Puyo Puyo in python
python in virtualenv
PPAP in Python
Quad-tree in Python
Reflection in Python
Hashable in python
DirectLiNGAM in Python
LiNGAM in Python
Flatten in python
flatten in python
Calculation result after the decimal point in Python
Sorted list in Python
Daily AtCoder # 36 in Python
Clustering text in Python
Daily AtCoder # 2 in Python
Daily AtCoder # 32 in Python
Daily AtCoder # 6 in Python
Daily AtCoder # 18 in Python
Edit fonts in Python
Singleton pattern in Python
Read DXF in python
Daily AtCoder # 53 in Python
Use config.ini in Python