Caffe is an open-source deep learning framework that has been attracting a lot of attention recently. The official homepage is here. It can be used from C++, Python, and MATLAB, so you can pick whichever language you are most comfortable with. You can see an image classification demo here, so if you are interested, please try it.
How to install Caffe is summarized in this article, so please refer to it if you like.
MNIST is a database of handwritten digit images, consisting of 60,000 training samples and 10,000 test samples, each 28x28 px. It is widely used in machine learning and deep learning as a benchmark for neural networks.
Caffe has a script that makes learning MNIST easy, so I tried it!
First, change to Caffe's root directory. Then run the script bundled with Caffe to download the MNIST dataset. wget and gunzip are required for this, so if you don't have them, install them with Homebrew.
cd /path/to/caffe
./data/mnist/get_mnist.sh
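Before running the script, you can check whether the required tools are on your PATH. This is just a convenience sketch; on macOS, any missing tool can be installed with Homebrew (`brew install wget gzip` — gunzip ships with the gzip package).

```shell
# Check that the tools get_mnist.sh relies on are available.
for tool in wget gunzip; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found"
  else
    echo "$tool: missing"
  fi
done
```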
Caffe reads image data in LevelDB or LMDB format for training, so convert the data downloaded above to LMDB with the following script.
./examples/mnist/create_mnist.sh
This should create ./examples/mnist/mnist_train_lmdb
and ./examples/mnist/mnist_test_lmdb
. The train data is used for training and the test data for evaluating it.
If you built Caffe with CPU_ONLY, then before training, rewrite solver_mode: GPU
in ./examples/mnist/lenet_solver.prototxt
to solver_mode: CPU
. If you skip this step, training will abort with an error.
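The edit above can also be done as a one-liner with sed. The sketch below demonstrates it on a throwaway file rather than the real prototxt (`sed -i.bak` works with both GNU and BSD sed, leaving a backup behind).

```shell
# Sketch: flip solver_mode with sed, demonstrated on a temporary copy.
# In practice the target file is ./examples/mnist/lenet_solver.prototxt.
tmp=$(mktemp)
printf 'solver_mode: GPU\n' > "$tmp"
sed -i.bak 's/solver_mode: GPU/solver_mode: CPU/' "$tmp"
cat "$tmp"   # prints: solver_mode: CPU
rm -f "$tmp" "$tmp.bak"
```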
Finally, train on the prepared data. There is a script for this step as well, so just run it.
./examples/mnist/train_lenet.sh
Then, lenet_train_test.prototxt
and lenet_solver.prototxt
will be read, the layer network will be constructed, and training will begin.
You should see loss and accuracy values during training. The loss decreases as the predictions match the correct answers more often, and accuracy is the rate of correct classifications, so watching how these values change shows how training is progressing.
I0704 solver.cpp:343] Test net output #0: accuracy = 0.0912
I0704 solver.cpp:343] Test net output #1: loss = 2.44682 (* 1 = 2.44682 loss)
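If you want to track these numbers over a long run, you can pull them out of the log with awk. The heredoc below fakes two solver output lines in the same format as above; in a real run you would capture the log first (e.g. `./examples/mnist/train_lenet.sh 2>&1 | tee train.log`, since Caffe logs to stderr).

```shell
# Sketch: extract the accuracy values from a Caffe training log with awk.
cat > /tmp/caffe_sample.log <<'EOF'
I0704 solver.cpp:343] Test net output #0: accuracy = 0.0912
I0704 solver.cpp:343] Test net output #1: loss = 2.44682 (* 1 = 2.44682 loss)
EOF
awk '/accuracy =/ {print $NF}' /tmp/caffe_sample.log   # prints: 0.0912
```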
At the end of training, the test runs, the final evaluation of the network is displayed, and ./examples/mnist/lenet_iter_10000.caffemodel
is output as the trained model. Loading this file reproduces the network in its trained state at any time, so the trained network can be incorporated into actual products.
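For example, the saved weights can be re-evaluated on the test set with the caffe command-line tool. The binary path ./build/tools/caffe below is an assumption about a default CMake/Make build; adjust it to wherever your caffe binary lives.

```shell
# Sketch: re-evaluate the trained weights with the caffe CLI.
# The binary path is an assumption; change it to match your build.
CAFFE=./build/tools/caffe
if [ -x "$CAFFE" ]; then
  "$CAFFE" test \
    -model examples/mnist/lenet_train_test.prototxt \
    -weights examples/mnist/lenet_iter_10000.caffemodel \
    -iterations 100
else
  echo "caffe binary not found at $CAFFE"
fi
```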
This is the end of MNIST recognition training. This time I trained with the prototxt files provided out of the box, so there was not much point in tuning parameters, but it is fun to repeat training while changing the values in lenet_solver.prototxt
and compare the results, so please try it.
If there is interest, I will also write up how to use the model produced by training and how to graph the training progress.
If you find any mistakes or have comments, please let me know.