[Python] Introduction to Deep Learning (2) -- Try Your Own Nonlinear Regression with Chainer --

Introduction

In the previous article, "Introduction to Deep Learning (1) -- Understanding and Using Chainer --", I summarized how to use Chainer. Following the reference is enough to learn the API, but when I study machine learning, my rule of thumb is that I only truly understand a tool once I can use it to build a prediction model for a simple regression problem.

So this time we generate data from the nonlinear sin function and perform nonlinear regression with a model built in Chainer. Once you have worked through these steps, you should find it easier to move on to more advanced topics such as image classification.

Development environment

- OS: Mac OS X El Capitan (10.11.5)
- Python 2.7.12: Anaconda 4.1.1 (x86_64)
- Chainer 1.12.0

Goal of this article

As shown in the image below, we will build a nonlinear regression model that captures the sin function well.

[Figure ゴール.png (goal): the trained model's predictions tracing the sin curve]

Build a nonlinear regression model

The big picture of the program

MyChain.py


# -*- coding: utf-8 -*-
from chainer import Chain
import chainer.links as L
import chainer.functions as F

class MyChain(Chain):

    def __init__(self):
        super(MyChain, self).__init__(
            l1 = L.Linear(1, 100),
            l2 = L.Linear(100, 30),
            l3 = L.Linear(30, 1)
        )

    def predict(self, x):
        h1 = F.relu(self.l1(x))
        h2 = F.relu(self.l2(h1))
        return self.l3(h2)

example.py


# -*- coding: utf-8 -*-

# Numerical computation
import math
import random
import numpy as np
import matplotlib.pyplot as plt
# Chainer
from chainer import Chain, Variable
import chainer.functions as F
import chainer.links as L
from chainer import optimizers
from MyChain import MyChain

# Fix the random seeds for reproducibility
# (NumPy's seed matters here, since Chainer initializes weights via numpy.random)
random.seed(1)
np.random.seed(1)

# Generate sample data: the sin function is the true function
x, y = [], []
for i in np.linspace(-3, 3, 100):
    x.append([i])
    y.append([math.sin(i)])  # true function
# Wrap the lists as Chainer Variables
x = Variable(np.array(x, dtype=np.float32))
y = Variable(np.array(y, dtype=np.float32))

# Instantiate the NN model
model = MyChain()

# Loss function: mean squared error (MSE)
def forward(x, y, model):
    t = model.predict(x)
    loss = F.mean_squared_error(t, y)
    return loss

# Chainer optimizer: Adam is used as the optimization algorithm
optimizer = optimizers.Adam()
# Pass the model's parameters to the optimizer
optimizer.setup(model)

# Training loop: repeat the parameter updates
for i in range(1000):
    loss = forward(x, y, model)
    print(loss.data)  # show the current MSE
    optimizer.update(forward, x, y, model)

# Plot the true function and the model's predictions
t = model.predict(x)
plt.plot(x.data, y.data)
plt.scatter(x.data, t.data)
plt.grid(which='major', color='gray', linestyle='-')
plt.ylim(-1.5, 1.5)
plt.xlim(-4, 4)
plt.show()

Generation of sample data

First, generate the teacher data used to build the nonlinear regression model. This time we use the sin function, with one input and one output.

# Generate sample data: the sin function is the true function
x, y = [], []
for i in np.linspace(-3, 3, 100):
    x.append([i])
    y.append([math.sin(i)])  # true function
# Wrap the lists as Chainer Variables
x = Variable(np.array(x, dtype=np.float32))
y = Variable(np.array(y, dtype=np.float32))
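
As an aside, the same teacher data can be generated without a Python loop by using NumPy's vectorized operations. The sketch below is equivalent to the loop above (the names x_np and y_np are just illustrative):

# Vectorized equivalent of the loop above: same 100 points, no Python loop
x_np = np.linspace(-3, 3, 100, dtype=np.float32).reshape(-1, 1)
y_np = np.sin(x_np)  # the true function, applied elementwise
x = Variable(x_np)
y = Variable(y_np)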

Model definition

Next, build the deep learning model with Chainer. This time I used a four-layer structure: an input layer, two hidden layers, and an output layer. The number of nodes in each layer was chosen somewhat arbitrarily (as I wrote in the previous article, this comes down to experience and intuition); if you are interested, try editing these values. The reason there are two hidden layers is that with only one the model did not capture the shape of the function well, so I added another (a hypothetical one-hidden-layer variant is sketched after the code below).

MyChain.py


# -*- coding: utf-8 -*-
from chainer import Chain
import chainer.links as L
import chainer.functions as F

class MyChain(Chain):

    def __init__(self):
        # Layer sizes: 1 input -> 100 -> 30 -> 1 output
        super(MyChain, self).__init__(
            l1 = L.Linear(1, 100),
            l2 = L.Linear(100, 30),
            l3 = L.Linear(30, 1)
        )

    def predict(self, x):
        h1 = F.relu(self.l1(x))   # hidden layer 1 with ReLU activation
        h2 = F.relu(self.l2(h1))  # hidden layer 2 with ReLU activation
        return self.l3(h2)        # linear output layer
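
For reference, a single-hidden-layer version, a hypothetical reconstruction of what I tried before adding the second hidden layer, would look like this (the class name MyChainShallow is mine):

# Hypothetical one-hidden-layer variant for comparison; as noted above,
# this shallower model did not capture the shape of the function well.
class MyChainShallow(Chain):

    def __init__(self):
        super(MyChainShallow, self).__init__(
            l1 = L.Linear(1, 100),
            l2 = L.Linear(100, 1)
        )

    def predict(self, x):
        h1 = F.relu(self.l1(x))
        return self.l2(h1)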

The point here is the use of ReLU as the activation function. Not long ago, the sigmoid function was the standard choice, but when training parameters by backpropagation, the gradients shrink toward the earlier layers (the vanishing gradient problem), and ReLU is now often used to avoid this. I only understand this area intuitively, so I need to study it a bit more; there are various other articles explaining activation functions, so please check them out. For comparison, a sigmoid version of predict is sketched below. Reference: [Machine learning] I will explain while trying the deep learning framework Chainer.
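
To see how easy it is to swap activations, here is what the predict method would look like with sigmoid instead of ReLU (illustrative only; the model in this article keeps ReLU):

    def predict(self, x):
        # Same architecture with sigmoid activations; in deeper networks
        # this choice tends to suffer from vanishing gradients.
        h1 = F.sigmoid(self.l1(x))
        h2 = F.sigmoid(self.l2(h1))
        return self.l3(h2)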

Check the untrained model

First, let's see what the model predicts before any training. Looking at the intermediate stages, not just the final result, should help you get a feel for the whole process.

# Instantiate the NN model
model = MyChain()

# Plot the true function and the untrained model's predictions
t = model.predict(x)
plt.plot(x.data, y.data)
plt.scatter(x.data, t.data)
plt.grid(which='major', color='gray', linestyle='-')
plt.ylim(-1.5, 1.5)
plt.xlim(-4, 4)
plt.show()
[Figure: the untrained model's scattered predictions plotted against the sin curve]

Before training, you can see that the model does not capture the shape of the true function at all.

Train the parameters

To train the parameters, we first define a loss function. This time, we use the mean squared error (MSE) as the loss function:

{\rm MSE} = \dfrac{1}{N} \sum_{n=1}^{N} \left( \hat{y}_{n} - y_{n} \right)^{2}

# Loss function: mean squared error (MSE)
def forward(x, y, model):
    t = model.predict(x)
    loss = F.mean_squared_error(t, y)
    return loss

With the loss function defined, Chainer can automatically compute the gradients of the model parameters with respect to the loss, which the optimizer then uses.

# Chainer optimizer: Adam is used as the optimization algorithm
optimizer = optimizers.Adam()
# Pass the model's parameters to the optimizer
optimizer.setup(model)
# One gradient update
optimizer.update(forward, x, y, model)

This is the end of the basic flow.
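
For reference, calling optimizer.update(forward, x, y, model) is shorthand for the explicit sequence below (a sketch using the Chainer 1.x API): clear the gradients, run the forward pass, backpropagate, then apply one optimization step.

# Equivalent explicit steps (Chainer 1.x style)
model.zerograds()            # reset accumulated gradients
loss = forward(x, y, model)  # forward pass builds the computation graph
loss.backward()              # backpropagation fills in the parameter gradients
optimizer.update()           # apply one Adam step using those gradients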

Repeat the training

By repeating the `optimizer.update()` call above, the parameters converge to good values. In this article we train on the same full dataset every iteration, but the usual flow is minibatch training: in each cycle a subset of samples is drawn from the population and used as the teacher data, and a different subset is drawn in the next cycle (a minimal minibatch sketch follows the training loop below).

# Training loop: repeat the parameter updates
for i in range(1000):
    loss = forward(x, y, model)
    print(loss.data)  # show the current MSE
    optimizer.update(forward, x, y, model)
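
As mentioned above, a more typical flow draws a different subset of samples each cycle. A minimal minibatch sketch under that assumption (x_data, y_data, and batch_size are illustrative names of mine, not the article's code) might look like this:

# Minimal minibatch sketch: shuffle the indices each epoch and
# update on one slice of the dataset at a time.
# x_data / y_data are the raw NumPy arrays from the data-generation
# step, before wrapping them in Variable.
x_data = np.linspace(-3, 3, 100, dtype=np.float32).reshape(-1, 1)
y_data = np.sin(x_data)

batch_size = 20  # a free choice
for epoch in range(1000):
    perm = np.random.permutation(len(x_data))  # reshuffle each epoch
    for i in range(0, len(x_data), batch_size):
        idx = perm[i:i + batch_size]
        optimizer.update(forward,
                         Variable(x_data[idx]),
                         Variable(y_data[idx]),
                         model)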

[Figure: console output of the MSE decreasing over the training iterations]

You can see the squared error shrink as the training repeats. After training, the model approximates the function very smoothly.

[Figure ゴール.png (goal): the trained model's predictions tracing the sin curve]

References

1. Official Chainer Reference: plenty has been written in Japanese elsewhere, but I often ran into parts that no longer worked due to version changes, so the official English reference was the most stable.
2. Introduction to Deep Learning (1) -- Understanding and Using Chainer --

Bonus

We look forward to your follow! Qiita: Carat Yoshizaki / Twitter: @carat_yoshizaki / Hatena Blog: Carat COO Blog / Website: Carat

"Kikagaku" is a tutoring service where you can learn machine learning one-on-one. Please feel free to contact us if you are interested in "Kikagaku", where you can learn "mathematics -> programming -> web applications" in one sweep.
