It is an open source quantum chemistry calculation software that can be run using python.
How to use (mainly code) is New form of chemistry, Is your order a lead compound? was used as a reference. Thank you very much. (The above HP has more information on how to use it.)
** Structural optimization calculation ** was performed on aspirin (acetylsalicylic acid) to obtain the following information.
--Structural information of optimized structure (coordinates of each atom)
conda install psi4 psi4-rt python=3.7 -c psi4
conda install -c rdkit rdkit
import numpy as np
import pandas as pd
from rdkit import Chem
from rdkit.Chem.rdDistGeom import ETKDGv3, EmbedMolecule
from rdkit.Chem.rdForceFieldHelpers import MMFFHasAllMoleculeParams, MMFFOptimizeMolecule
from rdkit.Chem.Draw import IPythonConsole
import psi4
import datetime
import time
%%time
#Take a look at the calculation time
#Hardware side settings (CPU thread number and memory settings used for calculation)
psi4.set_num_threads(nthread=1)
psi4.set_memory("1GB")
#Molecules to enter (acetylsalicylic acid)
smiles = 'CC(=O)Oc1ccccc1C(=O)O'
#Decide the file name
t = datetime.datetime.fromtimestamp(time.time())
psi4.set_output_file("{}{}{}_{}{}_{}.log".format(t.year,
t.month,
t.day,
t.hour,
t.minute,
smiles))
#Coarse 3D structure optimization by generating 3D structure from SMILES
mol = Chem.MolFromSmiles(smiles)
mol = Chem.AddHs(mol)
params = ETKDGv3()
params.randomSeed = 1
EmbedMolecule(mol, params)
#Structural optimization with MMFF (Merck Molecular Force Field)
MMFFOptimizeMolecule(mol)
#UFF (Universal Force Field) If you want to optimize the structure in a universal force field
#UFFOptimizeMolecule(mol)
conf = mol.GetConformer()
#Convert to a format that can be input to Psi4.
#Set charge and spin multiplicity (below is charge 0, spin multiplicity 1)
mol_input = "0 1"
#Describe the coordinates of each atom in XYZ format
for atom in mol.GetAtoms():
mol_input += "\n " + atom.GetSymbol() + " " + str(conf.GetAtomPosition(atom.GetIdx()).x)\
+ " " + str(conf.GetAtomPosition(atom.GetIdx()).y)\
+ " " + str(conf.GetAtomPosition(atom.GetIdx()).z)
molecule = psi4.geometry(mol_input)
#Set calculation method (functional) and basis set
level = "b3lyp/6-31G*"
#Calculation method (functional), example of basis set
#theory = ['hf', 'b3lyp']
#basis_set = ['sto-3g', '3-21G', '6-31G(d)', '6-31+G(d,p)', '6-311++G(2d,p)']
#Perform structural optimization calculations
energy, wave_function = psi4.optimize(level, molecule=molecule, return_wfn=True)
'''output
Optimizer: Optimization complete!
CPU times: user 15min 50s, sys: 27.5 s, total: 16min 17s
Wall time: 17min 18s
'''
Using 1 thread and 1GB of memory, it took a total of 17 minutes to calculate one molecule.
print(molecule.save_string_xyz())
#The unit is angstrom
'''
0 1
C 3.098884439155 0.286778541532 -1.818971930267
C 1.631416601050 0.255588755108 -1.492773907603
O 0.755324426944 0.834934357982 -2.102478251457
O 1.396466682379 -0.489685966862 -0.380657317744
C 0.074752427021 -0.810467114907 -0.015892403575
C -0.343553415622 -2.104118243082 -0.317545006753
C -1.590020749705 -2.548730143826 0.117945091675
C -2.402170016296 -1.705115088782 0.878418776055
C -1.963835487350 -0.422144280017 1.191561137714
C -0.729128903046 0.059858926344 0.731238467127
C -0.357527633804 1.464587591984 1.139219316977
O -0.521525170787 1.863006769904 2.268651582597
O 0.121812264509 2.275304342291 0.176882785876
H 3.636482218213 0.795645992511 -1.011210543547
H 3.498253279532 -0.729173607712 -1.889712382197
H 3.251358031652 0.823578110904 -2.755331437125
H 0.322818412683 -2.753052223004 -0.876772337761
H -1.916523030934 -3.555688162495 -0.124828509423
H -3.368180133191 -2.049556108518 1.235066217610
H -2.570453292820 0.238058474783 1.802873381996
H 0.073192650040 1.843321147028 -0.703222197850
'''
#The unit is atomic unit (a.u.Or Hartrees)
#energy
print(round(energy,3),'a.u.')
# >>> -648.689 a.u.
#Display HOMO (Unit: au= Hartree)
LUMO_idx = wave_function.nalpha()
HOMO_idx = LUMO_idx - 1
homo = wave_function.epsilon_a_subset("AO", "ALL").np[HOMO_idx]
lumo = wave_function.epsilon_a_subset("AO", "ALL").np[LUMO_idx]
print('homo:',round(homo,5),' a.u.')
print('lumo:',round(lumo,5),' a.u.')
# >>> homo: -0.26022 a.u.
# >>> lumo: -0.04309 a.u.
psi4.oeprop(wave_function, 'MULLIKEN_CHARGES')
mulliken = np.array(wave_function.atomic_point_charges())
for i, atom in enumerate(mol.GetAtoms()):
print(atom.GetSymbol(),round(mulliken[i],4))
#Try to visualize with a similarity map
from rdkit.Chem.Draw import SimilarityMaps
fig = SimilarityMaps.GetSimilarityMapFromWeights(mol,
mulliken,
colorMap='RdBu')
'''output
C -0.527
C 0.6015
O -0.4756
O -0.5025
C 0.2962
C -0.147
C -0.1334
C -0.1206
C -0.1705
C 0.0273
C 0.5345
O -0.4395
O -0.5899
H 0.2051
H 0.1907
H 0.1951
H 0.1468
H 0.1476
H 0.1462
H 0.1705
H 0.4445
'''
Since Mulliken density analysis assumes that the smallest basis set such as STO-3G is used for the basis set, ** an incorrect charge distribution may be obtained when using a large basis set *. * And that.
From each coordinate component of the dipole moment, the magnitude of the dipole moment of the entire molecule is obtained.
dipole_x, dipole_y, dipole_z = psi4.variable('SCF DIPOLE X'), psi4.variable('SCF DIPOLE Y'), psi4.variable('SCF DIPOLE Z')
dipole_moment = np.sqrt(dipole_x ** 2 + dipole_y ** 2 + dipole_z ** 2)
#The unit is Debye(D,debye)
print(round(dipole_moment,3),'D')
# >>> 5.175 D
-** HF method (SCF method) **: Molecular orbital method that solves the electronic state, which is the solution of the Schrödinger equation, for the approximate wave function. --Since we are dealing with electrons moving in an averaged electron cloud, we are underestimating the repulsion between the electrons. (The difference between the electronic state solved by the approximate wave function of the HF method and the actual value is called ** electron correlation ) - DFT calculation **: Solve the equivalent Kohn-Sham equation instead of the Schrodinger equation --The exchange interaction and the electron correlation interaction, which are not included in the HF method, are partially considered for the exact solution. ――The calculation can be overwhelmingly fast, but the choice of functional has a great effect on the calculation result. --B3LYP is often used (because there is little deviation from the experimental values)
** What size and shape of the electron and in what orbit "is expressed by a mathematical formula ** Example) 6-31G: It means that the inner shell orbital is divided into 6 Gauss type functions, and the valence orbital is divided into 3 and 1 Gauss type functions on the inside and outside.
There is no universal basis set system in all systems, and it is important to select a function system that suits the purpose each time according to the calculation purpose.
--If it is a general organic compound, first select it based on 6-31 (d). (It seems that good results will be obtained for the calculation time) --If the molecular weight is 100 or less, the combination of B3LYP and 6-31G (d) will give results within a practical calculation time.
-** d **, ** p : Polarization function (considering the effects of intramolecular bond formation and polarization of electrostatic interactions between ions) - d : Apply polarization function to atoms other than hydrogen atom - p : Apply polarization function to hydrogen atom - + , ** ++ : diffuse function (considering the spread of electron distribution) - + : Apply diffuse function to atoms heavier than H, He -++: Apply the diffuse function to light elements (H, He)
--General organic compounds: 6-31G (d) (polarization function: d added) --Negatively charged chemical species (anion), excited state: 6-31 + G (d) (diffuse function: + added) --Chemical species with positive charge (cation): 6-31G (d, p) (polarization function: dp added)
Recommended Posts