[PYTHON] Explaining the MNIST example in Chainer 1.11.0 and later

Introduction

The API changed quite a bit in Chainer 1.11.0, so I will write down my own understanding. As much as possible, I'll keep it understandable for people who are new to Python and Chainer.

The code is here. It is the file called train_mnist.py in the examples.

MNIST

MNIST is a dataset of 28x28 pixel images of handwritten digits. It is often used as an introduction to machine learning.

Network

import chainer
import chainer.functions as F
import chainer.links as L


class MLP(chainer.Chain):
    def __init__(self, n_in, n_units, n_out):
        # Register the layers (links) that hold the trainable parameters
        super(MLP, self).__init__(
            l1=L.Linear(n_in, n_units),
            l2=L.Linear(n_units, n_units),
            l3=L.Linear(n_units, n_out),
        )

    def __call__(self, x):
        # Forward pass: two hidden layers with ReLU, then a linear output
        h1 = F.relu(self.l1(x))
        h2 = F.relu(self.l2(h1))
        return self.l3(h2)

In the network definition, __init__ defines the layers to be used. This time, three fully connected layers (L.Linear) named l1, l2, and l3 are registered.

__call__ describes the actual network computation. This time, the activation function relu is applied to the outputs of l1 and l2.
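As a quick sanity check, you could instantiate the network and push one dummy batch through it, as in this minimal sketch (784 = 28x28 flattened pixels, 10 = the ten digit classes):

import numpy as np

model = MLP(784, 1000, 10)
x = np.zeros((1, 784), dtype=np.float32)  # one dummy flattened image
y = model(x)
print(y.data.shape)  # (1, 10): one score per digit class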

Parser

For the time being, the parser:

import argparse

parser = argparse.ArgumentParser(description='Chainer example: MNIST')
parser.add_argument('--batchsize', '-b', type=int, default=100,
                    help='Number of images in each mini-batch')
parser.add_argument('--epoch', '-e', type=int, default=20,
                    help='Number of sweeps over the dataset to train')
parser.add_argument('--gpu', '-g', type=int, default=-1,
                    help='GPU ID (negative value indicates CPU)')
parser.add_argument('--out', '-o', default='result',
                    help='Directory to output the result')
parser.add_argument('--resume', '-r', default='',
                    help='Resume the training from snapshot')
parser.add_argument('--unit', '-u', type=int, default=1000,
                    help='Number of units')
args = parser.parse_args()

print('GPU: {}'.format(args.gpu))
print('# unit: {}'.format(args.unit))
print('# Minibatch-size: {}'.format(args.batchsize))
print('# epoch: {}'.format(args.epoch))
print('')

The parser is a handy helper that makes it easy to set parameters when running a Python script from the command line. For example, if you execute the following in a terminal:

> $ python train_mnist.py -g 0 -u 100
GPU: 0
# unit: 100
# Minibatch-size: 100
# epoch: 20

this is what gets displayed. If epoch is not specified, the default value is used. If you want to add an argument yourself, the pattern is

parser.add_argument('name to call later', '-how to specify it in the terminal', type=int if it is a number, default=value used when not specified)

and the value can then be read back from args later.
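For instance, adding a made-up --threshold option would look like this (the name and values are purely for illustration):

parser.add_argument('--threshold', '-t', type=int, default=10,
                    help='a made-up option for illustration')
args = parser.parse_args()
print(args.threshold)  # read it back later by the long name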

Data initialization

In Chainer, the train data and test data come ready-made:

train, test = chainer.datasets.get_mnist()

This just fetches the MNIST data and puts it into train and test. As for what the contents look like, one element (train[0]) is something like

[[.234809284, .324039284, .34809382 ... .04843098], 3]

that is, the input values on the left and the answer (the label) on the right, as a pair. With Chainer, you learn from train, try the model on test, and check the rate of correct answers.
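A quick sketch to check this yourself (the exact label printed is just whatever the first training example happens to be):

import chainer

train, test = chainer.datasets.get_mnist()
x, t = train[0]
print(x.shape, x.dtype)  # (784,) float32, pixel values scaled to [0, 1]
print(t)                 # the label, an integer from 0 to 9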

Iterator

In the past, you had to prepare mini-batches yourself and loop over the data many times to train, but from 1.11.0 you just wrap the data (like train above) in an iterator. You no longer have to write the for loops yourself.

train_iter = chainer.iterators.SerialIterator(train, args.batchsize)
test_iter = chainer.iterators.SerialIterator(test, args.batchsize,
                                             repeat=False, shuffle=False)

and that seems to be all you need. It feels a little like magic.
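A minimal sketch of what the iterator actually hands out (assuming the default batchsize of 100):

batch = train_iter.next()  # a list of (input, label) pairs, length = batchsize
x, t = batch[0]            # one example from the mini-batch
print(len(batch), x.shape, t)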

Trainer

A Trainer has been added, and it handles almost everything on its own. The image is something like a private tutor: you hand the tutor a book of problems and answers and leave your child's studying to them (I don't know if the analogy works for you).

First, set up the trainer:

updater = training.StandardUpdater(train_iter, optimizer, device=args.gpu)
trainer = training.Trainer(updater, (args.epoch, 'epoch'), out=args.out)

In other words: using this train_iter (the problem book), have this optimizer (the study method) do the optimizing, and go around args.epoch times (how many laps over the data).
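Note that this snippet assumes model and optimizer already exist. In the official train_mnist.py they are set up roughly like this (L.Classifier wraps the MLP and adds the softmax cross-entropy loss and the accuracy computation):

model = L.Classifier(MLP(784, args.unit, 10))
optimizer = chainer.optimizers.Adam()
optimizer.setup(model)  # tie the optimizer to the model's parameters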

Some of the following are not always necessary:

trainer.extend(extensions.Evaluator(test_iter, model, device=args.gpu))
    # Evaluate the model on test_iter at each epoch (I think)
trainer.extend(extensions.dump_graph('main/loss'))
    # Save the network shape in dot format so that it can be drawn as a graph
trainer.extend(extensions.snapshot(), trigger=(args.epoch, 'epoch'))
    # Save a snapshot of the trainer; you can load it later and restart from the middle
trainer.extend(extensions.LogReport())
    # Write a log entry for each epoch
trainer.extend(extensions.PrintReport(
        ['epoch', 'main/loss', 'validation/main/loss',
         'main/accuracy', 'validation/main/accuracy']))
    # Choose which of the logged values get printed
trainer.extend(extensions.ProgressBar())
    # Shows how far the whole run and the current epoch have progressed

trainer.run()
    # After all the setup above, this actually executes the training. This line is mandatory
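By the way, the snapshot saved above pairs with the --resume option defined in the parser. In the official script it is used roughly like this, just before trainer.run():

if args.resume:
    # Restore the whole trainer state from a saved snapshot
    chainer.serializers.load_npz(args.resume, trainer)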

main/loss is the size of the difference from the answer, and main/accuracy is the rate of correct answers, both measured on the training data. The validation/main/loss and validation/main/accuracy values are the same quantities measured on test_iter by the Evaluator extension, which prefixes everything it reports with validation/.

There are probably places where I explain one thing and skip another; those are the parts I don't fully understand yet.

I'm planning to post the details of how I actually played with it later.
