[PYTHON] A Personal Introduction to Chainer

One of the buzzwords of 2015 and 2016 was "artificial intelligence". Saying "I can't use it because I don't know it well" is not really an option, so I'm going to start studying Chainer, whether or not it has anything to do with my work.

Until now I have used SVM or RandomForest whenever I needed a classifier at work. From now on, though, I can predict the exchange that will follow if I write yet another SVM classifier as usual ("What happens if you do it with deep learning?" → "I haven't tried that"), so I want to show that I am sensitive to buzzwords (for whatever reason).

Install chainer

Read http://docs.chainer.org/en/stable/install.html. The installation is so easy that if you cannot follow that page, you should probably give up on the spot.

For the time being, CUDA cannot be used on my Mac at home, so I will install chainer without CUDA support.

Is it available from Anaconda?

Before that: the Python environment on my Mac at home is built with Anaconda, so I first check whether chainer can be installed with conda.

% anaconda search -t conda chainer
Using Anaconda API: https://api.anaconda.org
Run 'anaconda show <USER/PACKAGE>' to get more details:
Packages:
     Name                      |  Version | Package Types   | Platforms
     ------------------------- |   ------ | --------------- | ---------------
     steerapi/chainer          |        0 | conda           | win-64
                                          : A flexible framework of neural networks
Found 1 packages

Only a win-64 package seems to be available, so I install chainer with pip, as the documentation describes.

% pip install chainer
(Omission)
Installing collected packages: chainer
Successfully installed chainer-1.19.0

At this point, I didn't get an error when I tried import chainer on ipython, so I think it's probably OK.
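The same check can be done from a script instead of an interactive ipython session. The following is only a minimal sketch: the helper name `installed_version` is made up for this note and is not part of chainer. It uses nothing but the standard library to look the package up, so it also runs on a machine where chainer is missing.

```python
import importlib
import importlib.util


def installed_version(package_name):
    """Return the package's __version__ string, or None if it is not installed.

    Hypothetical helper for this note; it is not part of chainer.
    """
    if importlib.util.find_spec(package_name) is None:
        return None  # the package cannot be found in this environment
    module = importlib.import_module(package_name)
    # Not every package defines __version__, so fall back to "unknown".
    return getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    version = installed_version("chainer")
    if version is None:
        print("chainer is not installed")
    else:
        print("chainer", version, "imported without errors")
```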

Conducting the tutorial

After installation, work through the Tutorial. This matters both for learning what can be done and as a hint about which keywords to search for when reading the documentation to achieve what I want to do.

What I came to understand here is that you need to know the jargon of neural networks, at least well enough to read the English documentation. If all you can picture is, in Japanese, a directed graph of ○ and → with weights on the connections, this tutorial is hard to read in the first place (that's me).

Well, but the keyword is "Define-by-Run".

Before conducting the tutorial

The code shown in the tutorial omits the following imports.

import numpy as np
import chainer
from chainer import cuda, Function, gradient_check, report, training, utils, Variable
from chainer import datasets, iterators, optimizers, serializers
from chainer import Link, Chain, ChainList
import chainer.functions as F
import chainer.links as L
from chainer.training import extensions

I don't use CUDA on my Mac at home, so I suppose the cuda import should be left out there.

MNIST

The tutorial also introduces an MNIST implementation. First, prepare the data.

train, test = datasets.get_mnist()

20170104_001.png

When this is executed, the MNIST handwritten digit data is downloaded, as shown in the figure.

% ls -A ~/.chainer/dataset/pfnet/chainer/mnist/
test.npz   train.npz

The training dataset is shuffled on every epoch, but the test dataset does not need to be shuffled, so the tutorial says to configure the iterators as follows. In other words, the training and test iterators need different options.

train_iter = iterators.SerialIterator(train, batch_size=100, shuffle=True)
test_iter = iterators.SerialIterator(test, batch_size=100, repeat=False, shuffle=False)
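To convince myself of what the repeat and shuffle options mean, here is a rough sketch in plain Python. The generator `minibatches` is a hypothetical stand-in written for this note, not chainer's implementation; in particular it simply yields a short final batch and makes no attempt to reproduce how SerialIterator handles the end of an epoch.

```python
import itertools
import random


def minibatches(dataset, batch_size, repeat=True, shuffle=True, seed=0):
    """Yield (epoch, batch) pairs in the spirit of a serial iterator.

    Hypothetical helper for illustration; not chainer's implementation.
    """
    rng = random.Random(seed)
    order = list(range(len(dataset)))
    epoch = 0
    while True:
        if shuffle:
            rng.shuffle(order)  # a fresh order for every pass over the data
        for start in range(0, len(order), batch_size):
            yield epoch, [dataset[i] for i in order[start:start + batch_size]]
        epoch += 1
        if not repeat:
            return  # evaluation style: stop after exactly one pass


data = list(range(10))

# Test-style iteration: a single pass in the original order.
test_batches = [batch for _, batch in minibatches(data, 4, repeat=False, shuffle=False)]
print(test_batches)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]

# Training-style iteration never ends on its own, so take the first 6 batches (2 epochs).
train_batches = list(itertools.islice(minibatches(data, 4), 6))
for epoch, batch in train_batches:
    print(epoch, batch)
```

With repeat=True the iterator never runs out, which is why something else (the trainer's stop condition) has to decide when training ends.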

Now that the dataset is ready, follow the tutorial and define a three-layer network.

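class MLP(Chain):
    def __init__(self, n_units, n_out):
        # Chainer v1 style: links are passed to the constructor as keyword arguments.
        super(MLP, self).__init__(
            l1=L.Linear(None, n_units),  # input -> hidden (input size is inferred on the first call)
            l2=L.Linear(None, n_units),  # hidden -> hidden
            l3=L.Linear(None, n_out),    # hidden -> output
        )

    def __call__(self, x):
        h1 = F.relu(self.l1(x))
        h2 = F.relu(self.l2(h1))
        y = self.l3(h2)
        return y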

I suppose it is called Chainer because l1, l2, and l3 are called in a chain. Each of l1, l2, and l3 looks like a function that takes an input and produces an output. Chainer calls these links, and, as I understand it, the goal of the framework is to optimize the parameters these links hold. Is that roughly right?

In a three-layer network like this, the second layer is usually treated as the hidden layer, but this network definition never explicitly declares l2 to be a hidden layer. When __call__ is executed, however, h1 is computed from the input x by l1, h1 is passed to l2 without being output, and the resulting h2 is likewise passed to l3 without being output. Only the final result y is returned. Is it correct to understand that l2 ends up acting as the hidden layer as a consequence of this structure?

This means that to extend class MLP from 3 layers to 4 or 5 layers, it should be enough to add intermediate layers in __init__ (and to call them in __call__).
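Both observations (the hidden layers are simply the values that never leave __call__, and going deeper only means adding layers) can be tried out with plain NumPy. This is a sketch under my own assumptions, not chainer code: `init_layers` and `forward` are hypothetical names, the weights are random, and the input is fake data shaped like flattened 28 × 28 images.

```python
import numpy as np


def init_layers(sizes, rng):
    """Create (W, b) pairs for consecutive layer sizes, e.g. [784, 100, 100, 10]."""
    return [
        (rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def forward(layers, x):
    """Apply ReLU after every layer except the last, mirroring MLP.__call__."""
    h = x
    for W, b in layers[:-1]:
        h = np.maximum(h @ W + b, 0.0)  # hidden value: never returned to the caller
    W, b = layers[-1]
    return h @ W + b  # only the final output y leaves the function


rng = np.random.default_rng(0)
x = rng.random((5, 784))  # 5 fake "images" of 28 * 28 = 784 pixels

three_layer = init_layers([784, 100, 100, 10], rng)
five_layer = init_layers([784, 100, 100, 100, 100, 10], rng)

print(forward(three_layer, x).shape)  # (5, 10)
print(forward(five_layer, x).shape)   # (5, 10)
```

Nothing in `forward` marks a layer as "hidden". A layer is hidden only because its output is consumed by the next layer instead of being returned.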

The accuracy and loss of this network are evaluated by `chainer.links.Classifier`, so I use that.

model = L.Classifier(MLP(100, 10))
optimizer = optimizers.SGD()
optimizer.setup(model)
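As far as I can tell, the two numbers the classifier wrapper reports are a softmax cross entropy loss and a plain accuracy. The NumPy sketch below shows what those two quantities are. It assumes those defaults rather than reading them out of chainer, and the scores and labels are made up only to exercise the functions.

```python
import numpy as np


def softmax_cross_entropy(y, t):
    """Mean cross entropy between softmax(y) and the integer labels t."""
    shifted = y - y.max(axis=1, keepdims=True)  # subtract the row max for numerical stability
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_prob[np.arange(len(t)), t].mean())


def accuracy(y, t):
    """Fraction of rows whose highest-scoring class equals the label."""
    return float((y.argmax(axis=1) == t).mean())


# Made-up scores for 3 samples and 3 classes.
y = np.array([[2.0, 0.5, 0.1],
              [0.2, 0.1, 3.0],
              [1.0, 1.5, 0.3]])
t = np.array([0, 2, 0])

print("loss:", softmax_cross_entropy(y, t))
print("accuracy:", accuracy(y, t))  # 2 of the 3 predictions match their labels
```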

The SGD() that suddenly appears here is [stochastic gradient descent](https://ja.wikipedia.org/wiki/%E7%A2%BA%E7%8E%87%E7%9A%84%E5%8B%BE%E9%85%8D%E9%99%8D%E4%B8%8B%E6%B3%95).
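For my own understanding, here is the update rule in a few lines of NumPy. It is a toy sketch rather than what optimizers.SGD does internally: the problem (recovering the slope 3 from noiseless data), the helper `sgd_step`, and the learning rate are all made up for illustration. It is "stochastic" because each step uses the gradient of one shuffled minibatch instead of the whole dataset.

```python
import numpy as np


def sgd_step(param, grad, lr=0.01):
    """One plain SGD update: move against the gradient, scaled by lr."""
    return param - lr * grad


# Toy problem (made up): fit w so that x * w matches targets generated with w = 3.
rng = np.random.default_rng(0)
x = rng.random(1000)
target = 3.0 * x

w = 0.0
for epoch in range(20):
    order = rng.permutation(len(x))           # shuffle on every epoch
    for start in range(0, len(x), 100):       # minibatches of 100
        idx = order[start:start + 100]
        error = x[idx] * w - target[idx]
        grad = 2.0 * np.mean(error * x[idx])  # d/dw of the mean squared error
        w = sgd_step(w, grad, lr=0.1)

print(w)  # approaches 3.0
```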

At this point, it is finally possible to train on the training set.

updater = training.StandardUpdater(train_iter, optimizer)
trainer = training.Trainer(updater, (20, 'epoch'), out='result')

To train, you call run() on the trainer. But I also want to see how training is progressing (or rather, I want to see that the Python script I have written so far actually works), and for that you register extensions.

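trainer.extend(extensions.Evaluator(test_iter, model))  # evaluate on the test set
trainer.extend(extensions.LogReport())                  # write the log under the out directory
trainer.extend(extensions.PrintReport(['epoch', 'main/accuracy', 'validation/main/accuracy']))
trainer.extend(extensions.ProgressBar())
trainer.run()  # start training (20 epochs, as set on the Trainer)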

20170104_002.png (Omitted) 20170104_003.png

So, after 20 epochs of training, a classifier has been created. The result folder stores the training log in a text file called log (because I registered extensions.LogReport and passed out='result' to the Trainer).
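The log file can be read back for a quick look at the best epoch. This sketch assumes the file is a JSON list with one dictionary per epoch, keyed by the same names passed to PrintReport. I have not verified the format beyond my own run, and `load_log` and `best_entry` are hypothetical helper names.

```python
import json
import os


def load_log(path):
    """Read the log file, assumed to be a JSON list of per-epoch dictionaries."""
    with open(path) as f:
        return json.load(f)


def best_entry(entries, key="validation/main/accuracy"):
    """Return the entry with the highest value for key, skipping entries without it."""
    scored = [entry for entry in entries if key in entry]
    return max(scored, key=lambda entry: entry[key]) if scored else None


log_path = os.path.join("result", "log")
if os.path.exists(log_path):  # only try to read the file after a training run
    best = best_entry(load_log(log_path))
    if best is not None:
        print(best.get("epoch"), best["validation/main/accuracy"])
```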

Summary

Summary so far.

- A Link is a unit (function? class?) that receives an input and produces an output
- Links can be written so that they are called in a chain
  - What link1 outputs after receiving its input becomes the input of link2
  - The output of link2 becomes the input of link3
- What was "Define-by-Run" again?
  - It is hard to understand without comparing it with other, earlier neural network implementations
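On the "Define-by-Run" question, my current understanding is that the computational graph is not declared up front but recorded while the forward code runs, so ordinary Python control flow decides its shape. The toy below is only a sketch of that idea with scalars. `Var` is a made-up class and has nothing to do with the internals of chainer.Variable.

```python
class Var:
    """A scalar that remembers how it was computed (hypothetical, not chainer.Variable)."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # tuple of (parent Var, local gradient) pairs
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value, ((self, other.value), (other, self.value)))

    def backward(self, upstream=1.0):
        # Walk back along whatever graph the forward code happened to build.
        self.grad += upstream
        for parent, local_grad in self.parents:
            parent.backward(upstream * local_grad)


def forward(x, w, n_repeat):
    # A normal Python loop decides how deep the recorded graph becomes.
    y = x
    for _ in range(n_repeat):
        y = y * w
    return y


x, w = Var(2.0), Var(3.0)
y = forward(x, w, n_repeat=2)  # y = x * w * w
y.backward()
print(y.value, x.grad, w.grad)  # 18.0 9.0 12.0
```

Calling `forward` with a different `n_repeat` builds a different graph without any separate "define the network" step, which is the part I take the slogan to refer to.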

My impression so far: I do need a little more knowledge about neural networks, but Chainer does not demand complicated code that depends heavily on the library. The code is written and reasoned about in a very Python-like way, which I found appealing. I see why it is popular.

By the way, at http://qiita.com/fukuit/items/d69d8ca1ad558c4de014 I tried classifying digits with the k-nearest neighbors implementation that ships with OpenCV. How does this result compare with that? The accuracy here is about 0.95 after 20 epochs, so as a classifier it can arguably be said to perform better than kNN, which was about 0.91.

Today's code

That is as far as I got with the tutorial.
