(Updated 2014.02.12: added notes on reinstalling Boost and installing PIL) (Updated 2014.02.19: added structural formula drawing with Cairo) (Updated 2014.03.11: revised structural formula drawing with Cairo)
OpenBabel (C++) and CDK (Java) are the well-known open-source cheminformatics toolkits, but RDKit can be used from Python. That makes it convenient: a relatively simple script is enough to draw, search, and analyze chemical structural formulas.
If you use Homebrew's Python, remember to add it to your PATH in .bash_profile or a similar file.
If NumPy is not installed yet, install it.
pip install numpy
A community-maintained Homebrew formula is publicly available, so tap it and install from it. https://github.com/edc/homebrew-rdkit
brew tap edc/homebrew-rdkit
brew install rdkit
This installs rdkit together with its dependencies cmake, wget, swig, and boost. Installing boost and rdkit takes some time.
If you are using Homebrew's Python, you will get a Fatal Python error caused by Boost. In that case, uninstall Boost and rebuild it from source with the following commands.
brew uninstall boost
brew install boost --build-from-source
Start Python from the command line and run from rdkit import Chem. If the import succeeds without errors, the installation is successful.
PIL is required to display structural formula images. Install Pillow, which provides it.
pip install pillow
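Since several of the steps above depend on which Python is first in your PATH, it can help to confirm that the modules installed so far are visible to the interpreter you are actually running. The following is a minimal sketch, not part of RDKit: the helper name check_modules is made up for illustration, and it assumes Python 3.4 or later for importlib.util.find_spec (on Python 2.7, wrapping each import in try/except ImportError would serve the same purpose).

```python
import importlib.util


def check_modules(names):
    """Return {module name: True/False} for whether each module can be located.

    Hypothetical helper for illustration. find_spec only locates a top-level
    module, it does not import it, so this check will not trigger the Boost
    crash itself; it only tells you whether the module is on the search path.
    """
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Modules installed in the steps above (Pillow is imported under the name PIL).
for name, found in check_modules(["rdkit", "numpy", "PIL"]).items():
    print("%s: %s" % (name, "found" if found else "MISSING"))
```

If a module is reported as missing even though the install command succeeded, the likely cause is that pip or brew installed it for a different Python than the one being run.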
For example, the following code draws a chemical structural formula from SMILES and outputs it as PNG.
rdkittest.py
from rdkit import Chem
from rdkit.Chem import Draw
from rdkit.Chem import rdDepictor
mol = Chem.MolFromSmiles('CCC(CC)O[C@@H]1C=C(C[C@@H]([C@H]1NC(=O)C)[NH3+])C(=O)OCC')
rdDepictor.Compute2DCoords(mol)
Draw.MolToFile(mol, 'mol.png')
result:
The code is far shorter than the CDK equivalent. Computationally heavy functions such as drawing, searching, and analysis appear to be implemented in C++, so I don't expect it to be too slow.
However, as the output shows, the image quality at this stage is very poor compared to CDK. So in my case I search and analyze with RDKit and hand only the drawing off to CDK.
If Cairo and PyCairo are available (that is, import cairo succeeds), Draw.MolToFile() uses Cairo automatically when drawing the structural formula, and the image quality improves significantly.
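The drawing script above can also be wrapped in a small function so that a mistyped SMILES string is reported instead of failing later. This is only a sketch: the function name smiles_to_png and the default image size are assumptions chosen for illustration, and the only behavior relied on is that Chem.MolFromSmiles returns None when it cannot parse its input.

```python
def smiles_to_png(smiles, path, size=(300, 300)):
    """Draw the molecule given as a SMILES string to a PNG file at `path`.

    Hypothetical wrapper for illustration; raises ValueError on bad SMILES.
    """
    # Imported inside the function so the definition itself does not
    # require RDKit to be importable.
    from rdkit import Chem
    from rdkit.Chem import Draw, rdDepictor

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:  # MolFromSmiles returns None when parsing fails
        raise ValueError("could not parse SMILES: %r" % (smiles,))

    rdDepictor.Compute2DCoords(mol)       # generate 2D coordinates for the depiction
    Draw.MolToFile(mol, path, size=size)  # output format follows the file extension
    return path
```

Calling smiles_to_png('CCO', 'ethanol.png') should write a small image of ethanol, while an unparsable string raises ValueError rather than producing an empty or broken file.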
Both Cairo and PyCairo can be installed with Homebrew.
brew install cairo
brew install py2cairo
(For Python 2.7, the formula is py2cairo rather than pycairo.) Cairo installed with Homebrew conflicts with the default Cairo that comes with X11, so you will probably need to set the library path.
(Added on 2014.03.11) Also install pango and pygtk.
brew install pango
brew install pygtk
You are all set if you can import pango from Python. This improves the fonts and makes subscripts for atom counts and superscripts for ionic charges display correctly.