[PYTHON] A summary of GCN implementations for chemical compounds
Introduction
I set out to implement and experiment with a GCN based on the paper "Convolutional Networks on Graphs for Learning Molecular Fingerprints" (https://arxiv.org/abs/1509.09292) by David Duvenaud et al., a forerunner of GCNs for compounds. The original code was written with Autograd, a relatively minor library, and was hard to modify as is, so this memo records what I found when I examined several other open-source implementations.
Implementations examined
- The paper authors' implementation (original)
- DeepChem
- Chainer Chemistry
- OpenChem
What to look at
What I examined this time is not how atoms and bonds are encoded, but how the neural network itself is built.
The paper authors' implementation (original)
- DL framework used: Autograd
- URL: https://github.com/HIPS/neural-fingerprint
- GCN implementation code: https://github.com/HIPS/neural-fingerprint/blob/master/neuralfingerprint/build_convnet.py
- Comments (differences from the paper, etc.):
  - The code was written by the paper's authors, so naturally it matches the paper.
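As a reading aid, here is a minimal NumPy sketch of the idea in the paper: every layer updates the atom features from each atom and its neighbors, and every layer also adds a softmax contribution to the fingerprint. This is not the authors' code. The function and argument names are hypothetical, `tanh` is an arbitrary choice of nonlinearity, and the degree-specific weights and bond features from the paper are left out for brevity.

```python
import numpy as np

def softmax(x):
    """Row-wise softmax, shifted for numerical stability."""
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def neural_fingerprint(atom_feats, adj, hidden_weights, out_weights):
    """Simplified sum-of-layers fingerprint (hypothetical helper).

    atom_feats:     (n_atoms, n_feats) initial atom features
    adj:            (n_atoms, n_atoms) 0/1 adjacency matrix without self-loops
    hidden_weights: list of (n_feats, n_feats) matrices, one per layer
    out_weights:    list of (n_feats, fp_len) matrices, one per layer
    """
    fingerprint = np.zeros(out_weights[0].shape[1])
    h = atom_feats
    for H, W in zip(hidden_weights, out_weights):
        v = h + adj @ h                            # each atom plus the sum of its neighbors
        h = np.tanh(v @ H)                         # nonlinearity (assumed: tanh)
        fingerprint += softmax(h @ W).sum(axis=0)  # every layer adds to the fingerprint
    return fingerprint

# Toy input: a chain of 4 atoms with random features and weights.
rng = np.random.default_rng(0)
n_atoms, n_feats, fp_len, n_layers = 4, 5, 8, 2
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
atom_feats = rng.normal(size=(n_atoms, n_feats))
hidden_weights = [rng.normal(size=(n_feats, n_feats)) for _ in range(n_layers)]
out_weights = [rng.normal(size=(n_feats, fp_len)) for _ in range(n_layers)]

fp = neural_fingerprint(atom_feats, adj, hidden_weights, out_weights)
print(fp.shape)
```

Because each softmax row sums to 1, the entries of `fp` add up to `n_atoms * n_layers` in this sketch, which is a handy sanity check when porting the idea to another framework.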
DeepChem
- DL framework used: TensorFlow
- URL: https://deepchem.io/
- GCN implementation code:
  - https://github.com/deepchem/deepchem/blob/master/deepchem/models/graph_models.py (the `GraphConvModel` and `_GraphConvKerasModel` classes)
- Comments (differences from the paper, etc.):
  - The original sums the outputs of all layers (`return sum(all_layer_fps), atom_activations, array_rep`), but this implementation uses only the output of the final layer.
  - Each convolution layer is followed by a pooling operation over the neighboring nodes, which is not in the original paper.
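To make the pooling step concrete, here is a small NumPy sketch of pooling over neighboring nodes. It assumes max pooling over each atom and its bonded neighbors. The function name `neighbor_max_pool` is hypothetical and this is an illustration only, not DeepChem's actual code.

```python
import numpy as np

def neighbor_max_pool(h, adj):
    """Replace each atom's features with the element-wise max over itself and its neighbors."""
    n_atoms = h.shape[0]
    mask = (adj + np.eye(n_atoms)) > 0   # True where j is atom i itself or a neighbor of i
    # Shape (n_atoms, n_atoms, n_feats); non-neighbors become -inf so they never win the max.
    candidates = np.where(mask[:, :, None], h[None, :, :], -np.inf)
    return candidates.max(axis=1)

# Toy input: a chain of 3 atoms (0-1-2) with 2 features each.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
h = np.array([[1.0, 0.0],
              [0.0, 5.0],
              [3.0, 2.0]])

pooled = neighbor_max_pool(h, adj)
print(pooled)
# [[1. 5.]
#  [3. 5.]
#  [3. 5.]]
```

Atom 0 only sees itself and atom 1, so its first feature stays at 1, while atom 1 sees all three atoms and picks up the 3 from atom 2.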
Chainer Chemistry
- DL framework used: Chainer
- URL: https://github.com/chainer/chainer-chemistry/
- GCN implementation code (around here):
  - https://github.com/chainer/chainer-chemistry/blob/master/chainer_chemistry/models/nfp.py
  - https://github.com/chainer/chainer-chemistry/blob/efe323aa21f63a815130d673781e7cca1ccb72d2/chainer_chemistry/links/update/nfp_update.py#L9
  - https://github.com/chainer/chainer-chemistry/blob/efe323aa21f63a815130d673781e7cca1ccb72d2/chainer_chemistry/links/readout/nfp_readout.py#L7
- Comments (differences from the paper, etc.):
  - The original sums the outputs of all layers (`return sum(all_layer_fps), atom_activations, array_rep`), but in this implementation the outputs of the layers are concatenated.
  - Of the implementations I examined, I feel this one is the most faithful to the original paper.
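The three ways of combining layer outputs that come up in this memo (summing, concatenating, and keeping only the final layer) differ mainly in the shape of the resulting molecule vector. Here is a minimal NumPy sketch, where `layer_outputs` is a hypothetical stand-in for the per-layer molecule-level outputs and is not taken from any of the libraries:

```python
import numpy as np

rng = np.random.default_rng(0)
n_layers, fp_len = 3, 4

# Hypothetical per-layer outputs: one vector of length fp_len per layer.
layer_outputs = [rng.random(fp_len) for _ in range(n_layers)]

summed = np.sum(layer_outputs, axis=0)        # add the layers together
concatenated = np.concatenate(layer_outputs)  # keep the layers side by side
last_only = layer_outputs[-1]                 # final layer only

print(summed.shape, concatenated.shape, last_only.shape)
# (4,) (12,) (4,)
```

Summing keeps the vector length fixed regardless of depth, while concatenation grows it with the number of layers, so whatever layer consumes the vector has to be sized accordingly.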
OpenChem
- DL framework used: PyTorch
- URL: https://github.com/Mariewelt/OpenChem/
- GCN implementation code (around here):
  - https://github.com/Mariewelt/OpenChem/blob/master/openchem/modules/encoders/gcn_encoder.py
  - https://github.com/Mariewelt/OpenChem/blob/9e2c98040377cfa0ba4ebf1df159b2f638c6fe7c/openchem/layers/gcn.py#L11
- Comments (differences from the paper, etc.):
  - I suspect this implementation is based on a different paper. Perhaps for that reason, it includes operations not found in the original paper, such as taking the max in each layer.
  - As with DeepChem, the original sums the outputs of all layers (`return sum(all_layer_fps), atom_activations, array_rep`), but this implementation uses only the output of the final layer.
Conclusion
- It turns out that a single method can be implemented in various ways.
- I like the Chainer Chemistry implementation best. However, since I want to use PyTorch as my framework, I plan to implement it in PyTorch while referring to the Chainer Chemistry source.