I am studying deep learning by running Chainer's example code.
At the time of writing (June 2017), the latest version of Chainer is 2.0. It is not fully compatible with 1.x, so code written for older versions may not work. Reference: Differences between chainer versions (as of January 19, 2016)
This article is an implementation note on getting inference to work with the Chainer 2.0 MNIST sample.
For the implementation, I referred to this article: Chainer: Tutorial for Beginners Vol.1
Environment: Chainer 2.0, Python 2.7.10, running on CPU
https://github.com/abechi/chainer_mnist_predict
Chainer 2.0 MNIST sample (original) https://github.com/chainer/chainer/tree/v2.0.0/examples/mnist
train_mnist.py
# Run the training
trainer.run()
chainer.serializers.save_npz('my_mnist.model', model) # Added
$ python train_mnist.py --epoch 3
GPU: -1
# unit: 1000
# Minibatch-size: 100
# epoch: 3
epoch main/loss validation/main/loss main/accuracy validation/main/accuracy elapsed_time
1 0.191836 0.0885223 0.942233 0.9718 26.099
2 0.0726428 0.0825069 0.9768 0.974 53.4849
3 0.0466335 0.0751425 0.984983 0.9747 81.2683
$ ls
my_mnist.model result/ train_mnist.py*
predict_mnist.py
#!/usr/bin/env python
from __future__ import print_function
try:
    import matplotlib
    matplotlib.use('Agg')
except ImportError:
    pass
import argparse
import chainer
import chainer.functions as F
import chainer.links as L
from chainer import training
from chainer.training import extensions


# Network definition
class MLP(chainer.Chain):
    def __init__(self, n_units, n_out):
        super(MLP, self).__init__()
        with self.init_scope():
            # the size of the inputs to each layer will be inferred
            self.l1 = L.Linear(None, n_units)  # n_in -> n_units
            self.l2 = L.Linear(None, n_units)  # n_units -> n_units
            self.l3 = L.Linear(None, n_out)  # n_units -> n_out

    def __call__(self, x):
        h1 = F.relu(self.l1(x))
        h2 = F.relu(self.l2(h1))
        return self.l3(h2)


def main():
    parser = argparse.ArgumentParser(description='Chainer example: MNIST')
    parser.add_argument('--unit', '-u', type=int, default=1000,
                        help='Number of units')
    args = parser.parse_args()
    print('# unit: {}'.format(args.unit))
    print('')
    # Set up a neural network (wrapped in L.Classifier, as in train_mnist.py)
    model = L.Classifier(MLP(args.unit, 10))
    # Load the MNIST dataset
    train, test = chainer.datasets.get_mnist()
    # Load the parameters saved by train_mnist.py
    chainer.serializers.load_npz('my_mnist.model', model)
    # Take the first test sample and its correct label
    x, t = test[0]
    print('label:', t)
    # Add a batch axis: (784,) -> (1, 784)
    x = x[None, ...]
    # Call the MLP itself (model.predictor), skipping the loss computation
    y = model.predictor(x)
    # Get the raw score array out of the Variable
    y = y.data
    print('predicted_label:', y.argmax(axis=1)[0])


if __name__ == '__main__':
    main()
predict_mnist.py loads my_mnist.model and predicts the label of the first test sample.
$ python predict_mnist.py
# unit: 1000
label: 7
predicted_label: 7
The predicted label matches the correct label.
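The two array operations in the inference step can be checked in isolation. The following is a minimal NumPy-only sketch and not part of the original scripts: the `scores` array is a made-up stand-in for the output of `model.predictor(x)`, and `add_batch_axis` and `predicted_labels` are hypothetical helper names introduced only for illustration.

```python
import numpy as np


def add_batch_axis(x):
    """Turn one sample of shape (784,) into a minibatch of shape (1, 784)."""
    return x[None, ...]


def predicted_labels(scores):
    """Return the index of the highest score in each row (one row per sample)."""
    return scores.argmax(axis=1)


# A dummy flattened 28x28 image, shaped like one MNIST sample (length 784).
x = np.zeros(784, dtype=np.float32)
batch = add_batch_axis(x)
print(batch.shape)  # (1, 784)

# Made-up scores for one sample over 10 classes, with class 7 scoring highest.
scores = np.full((1, 10), -1.0, dtype=np.float32)
scores[0, 7] = 3.0
print(predicted_labels(scores)[0])  # 7
```

The network expects a minibatch, so a single sample needs the extra leading axis, and `argmax(axis=1)` then returns one label per row, which is why the script takes element `[0]` of the result.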
train_mnist.py
# Classifier reports softmax cross entropy loss and accuracy at every
# iteration, which will be used by the PrintReport extension below.
model = L.Classifier(MLP(args.unit, 10))
In train_mnist.py, the model is built by wrapping MLP in L.Classifier. The model object created for inference must be wrapped in L.Classifier in the same way.
If you create the model object without going through L.Classifier, loading the saved model fails with an error, because the parameter names stored in the file no longer match those of the object.
predict_mnist.py
# Set up a neural network
model = MLP(args.unit, 10)
error
KeyError: 'l2/b is not a file in the archive'
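The wording of this error comes from NumPy: the saved file is an `.npz` archive in which each parameter is stored under its path inside the model. Because L.Classifier holds the network as its `predictor` child, the entries are named like `predictor/l2/b`, while a bare MLP asks for `l2/b`. The sketch below reproduces that mismatch with NumPy alone. The array shapes are small placeholders, and the key list is an assumption about how the archive is laid out, not a dump of the real my_mnist.model.

```python
import io

import numpy as np

# Placeholder parameters, named the way a Classifier-wrapped MLP would save them.
saved = {
    'predictor/l1/W': np.zeros((4, 3), dtype=np.float32),
    'predictor/l1/b': np.zeros(4, dtype=np.float32),
    'predictor/l2/W': np.zeros((4, 4), dtype=np.float32),
    'predictor/l2/b': np.zeros(4, dtype=np.float32),
}

# Write the archive to memory instead of a file such as my_mnist.model.
buf = io.BytesIO()
np.savez(buf, **saved)
buf.seek(0)

archive = np.load(buf)
print(sorted(archive.files))

# Lookup with the 'predictor/' prefix (model wrapped in L.Classifier): found.
print(archive['predictor/l2/b'].shape)

# Lookup without the prefix (bare MLP): NumPy raises KeyError.
try:
    archive['l2/b']
    missing = False
except KeyError as err:
    missing = True
    print('KeyError:', err)
```

Listing `archive.files` on a saved model in this way is a quick check of which names the loader will look for.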
Reference: Save and load Chainer model