Introduction to Deep Learning for the first time (Chainer) Japanese character recognition Chapter 2 [Model generation by machine learning]

Hello, this is Licht. Following on from the previous chapter, Chapter 2 of this Deep Learning tutorial describes how to generate a Deep Learning prediction model by machine learning. How neural networks actually perform machine learning is explained in a later chapter; here the focus is on practical usage.

Preparation

First, download the source code from GitHub and put all of the files (hiraganaNN.py and the rest) in the same directory as the image dataset. Chapter 2 uses hiraganaNN.py, dataArgs.py, and hiragana_unicode.csv.

Run

In the terminal (command prompt), move to the HIRAGANA_NN directory and start training with:

python hiraganaNN.py

This starts machine learning with Deep Learning. Training takes a while, so here is some explanation while you wait. The HIRAGANA_NN directory contains images (110 * 110 pixels) sorted into one directory per hiragana. For example, the 305e directory holds images of the hiragana "zo" in various fonts and handwriting, as shown below. (image: zo_directory.png)

Broadly speaking, the purpose of machine learning is to learn from many different "zo" images what a "zo" looks like. In other words, the model is taught that all of these are "zo" (learning): (image: zozozo.png)

Then, shown a new image, it should be able to say "This is 'zo', isn't it?" (this is called identification, prediction, or recognition). (image: 305e_12382_reikofont.png)

It sounds easy, but because the machine is simple-minded, it makes mistakes that humans would not expect. For example, having learned the "zo" images above, it may flatly declare that the image below is not "zo"! (image: rotate_zo.png)

(The reason: it is tilted a little.) Mistakes like this, where the model learns the training examples too closely and loses generality (nothing other than the "zo" images above is recognized as "zo"), are called "overfitting". Another cause of poor performance is reduced learning efficiency, where training is made harder by properties that have nothing to do with recognizing the character, such as its color or density. (image: colorful_zo.png)

"Pre-processing" is performed to avoid performance degradation such as overfitting and reduced learning efficiency. There are many kinds of pre-processing. One example is data augmentation (rotation, translation, elastic distortion, noise, etc.): if the model has learned all sorts of "zo", it can cope with whatever comes. (image: various_zo.png)

Another example is data normalization (grayscale conversion, whitening, batch normalization, etc.), which simplifies the problem and improves learning efficiency.

Source code overview

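Starting from the first line:

# Map each hiragana's Unicode code point (a hex string such as '304a') to a class number.
unicode2number = {}
import csv
# 'rb' suits the Python 2 csv module; on Python 3, open the file in text mode with newline='' instead.
train_f = open('./hiragana_unicode.csv', 'rb')
train_reader = csv.reader(train_f)
train_row = train_reader
hiragana_unicode_list = []
counter = 0
for row in train_reader:
    for e in row:
        # Code points are numbered in the order they appear in the CSV.
        unicode2number[e] = counter
        counter = counter + 1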

Here, each hiragana is assigned a number. The one with Unicode code point 304a (o) is number 0, the one with 304b (ka) is number 1, and so on.
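The numbering can be tried out in isolation. The following is a minimal Python 3 sketch, not the code from hiraganaNN.py: the three-entry CSV is a hypothetical stand-in for hiragana_unicode.csv (whose full contents are not shown here), and number2char is a helper name added only for illustration.

```python
import csv
import io

# Hypothetical stand-in for hiragana_unicode.csv (assumption: one row of
# hex code points; the real file's layout is not shown in this chapter).
sample_csv = io.StringIO("304a,304b,304c\n")

unicode2number = {}
counter = 0
for row in csv.reader(sample_csv):
    for code_point in row:
        # Code points are numbered in the order they are read.
        unicode2number[code_point] = counter
        counter += 1

# Reverse lookup (illustrative helper): class number -> actual character.
number2char = {n: chr(int(cp, 16)) for cp, n in unicode2number.items()}

print(unicode2number)
print(number2char)
```

A reverse table like number2char is handy later for turning a predicted class number back into a readable character.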

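import os

# Every 4-character name in the working directory is treated as a hiragana directory (e.g. '305e').
files = os.listdir('./')
for file in files:
    if len(file) == 4:
        #Hiragana directory
        _unicode = file
        imgs = os.listdir('./' + _unicode + '/')
        counter = 0
        for img in imgs:
            if img.find('.png') > -1:
                # All entries except the last one in the directory go to the training set.
                if len(imgs) - counter != 1:
                ...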

Here, the images used as input data (training data) are read. Only the last image in each directory is set aside for testing. As each image is read,

x_train.append(src)
y_train.append(unicode2number[_unicode])

the image data is stored in x_train (x_test), and the correct label (0-83) is stored in y_train (y_test). The following part augments the data for each input image:

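# Add 9 augmented copies, so each source image yields 10 training images.
for x in range(1, 10):
    # 2: rotation, 3: translation, applied in that order.
    dst = dargs.argumentation([2, 3])
    # Binarize the result with Otsu's threshold.
    ret, dst = cv2.threshold(
        dst, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    x_train.append(dst)
    y_train.append(unicode2number[_unicode])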

Here, each image is randomly shifted and rotated to expand it to 10 images (the original plus 9 augmented copies). The argument of dargs.argumentation([2, 3]) is admittedly hard to read: 2 means rotation (rotation with three-dimensional depth) and 3 means translation, applied in that order, so the image is rotated first and then shifted.
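To get a feel for what "rotate, then shift" does to an image, here is a rough NumPy-only sketch. It is not the implementation in dataArgs.py: the functions rotate and random_shift, the angle range, and the shift range are all assumptions made for illustration, the rotation is a plain in-plane rotation rather than the three-dimensional one, and the Otsu binarization step is left out.

```python
import numpy as np

def rotate(img, angle_deg):
    """Rotate a 2-D image about its center (nearest neighbor, white fill)."""
    h, w = img.shape
    theta = np.deg2rad(angle_deg)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.indices((h, w))
    # Inverse mapping: for each output pixel, find the source pixel.
    src_x = np.cos(theta) * (xs - cx) + np.sin(theta) * (ys - cy) + cx
    src_y = -np.sin(theta) * (xs - cx) + np.cos(theta) * (ys - cy) + cy
    src_x = np.rint(src_x).astype(int)
    src_y = np.rint(src_y).astype(int)
    inside = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    out = np.full_like(img, 255)
    out[inside] = img[src_y[inside], src_x[inside]]
    return out

def random_shift(img, max_shift, rng):
    """Translate by a random (dy, dx); exposed pixels become white (255)."""
    h, w = img.shape
    dy, dx = (int(v) for v in rng.integers(-max_shift, max_shift + 1, size=2))
    out = np.full_like(img, 255)
    # Part of the source that is still visible after the shift.
    src = img[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    top, left = max(0, dy), max(0, dx)
    out[top:top + src.shape[0], left:left + src.shape[1]] = src
    return out

rng = np.random.default_rng(0)
src = np.full((110, 110), 255, dtype=np.uint8)
src[40:70, 50:60] = 0  # a black bar standing in for a character

x_train = [src]
for _ in range(9):
    dst = rotate(src, rng.uniform(-15, 15))  # rotation first (assumed range)
    dst = random_shift(dst, 5, rng)          # then translation (assumed range)
    x_train.append(dst)

print(len(x_train))  # 10
```

Applying the rotation before the shift mirrors the [2, 3] order described above; swapping the two would rotate the already shifted character around the image center instead.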

Next:

x_train = np.array(x_train).astype(np.float32).reshape(
    (len(x_train), 1, IMGSIZE, IMGSIZE)) / 255
y_train = np.array(y_train).astype(np.int32)

A grayscale image has pixel values of 0-255; dividing by 255 normalizes them to values of 0-1. This process improves learning efficiency.
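The effect of this conversion is easy to check on dummy data. In the sketch below the random images and the three labels are made up, and IMGSIZE = 110 is an assumption taken from the image size mentioned earlier; the value actually used in hiraganaNN.py is not shown in this chapter.

```python
import numpy as np

IMGSIZE = 110  # assumption: matches the 110 * 110 images described above

# Dummy grayscale images and labels standing in for the real dataset.
rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(IMGSIZE, IMGSIZE), dtype=np.uint8)
          for _ in range(3)]
labels = [0, 1, 83]

# Same conversion as in the listing: float32, shape (N, 1, H, W), scaled to 0-1.
x = np.array(images).astype(np.float32).reshape(
    (len(images), 1, IMGSIZE, IMGSIZE)) / 255
y = np.array(labels).astype(np.int32)

print(x.shape, x.dtype, y.dtype)
print(float(x.min()), float(x.max()))
```

The extra axis of size 1 is the channel dimension (one channel for grayscale), which is the layout that convolutional layers expect.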

That completes the preparation of the training data (x_train, y_train); the test data (x_test, y_test) is prepared in the same way. The last image in each directory is read for testing and used to verify the accuracy of the trained model. The test images are also augmented when they are read; the reason for this is explained around Chapter 7.
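The "last image per directory goes to the test set" rule can be sketched on its own. The function split_last_for_test and the file names below are hypothetical, and unlike the listing above this sketch filters out non-PNG entries before picking the last one, so treat it as an illustration of the idea rather than the script's exact behavior.

```python
def split_last_for_test(files_by_dir):
    """Hold out the last PNG of every directory for testing."""
    train, test = [], []
    for label_dir, names in sorted(files_by_dir.items()):
        pngs = [n for n in names if n.endswith('.png')]
        for i, name in enumerate(pngs):
            # Only the final image of each directory is held out.
            target = test if i == len(pngs) - 1 else train
            target.append((label_dir, name))
    return train, test

# Hypothetical directory listing: directory name -> file names.
listing = {
    '304a': ['a.png', 'b.png', 'c.png'],
    '305e': ['a.png', 'notes.txt', 'b.png'],
}
train, test = split_last_for_test(listing)
print(train)
print(test)
```

With only one held-out image per character the accuracy estimate is coarse, which is worth keeping in mind when reading the numbers later in this chapter.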

The rest of the source code defines the specific structure of the Deep Learning model, so its explanation is left to a later chapter.

Learning results

In the meantime, the progress of machine learning has been printed in the terminal.

('epoch', 1)
COMPUTING...
train mean loss=3.53368299341, accuracy=0.161205981514
test mean loss=1.92266467359, accuracy=0.506097565337
('epoch', 2)
COMPUTING...
train mean loss=1.66657279936, accuracy=0.518463188454
test mean loss=1.0855880198, accuracy=0.701219529277
.
.(A warning will appear, but you can ignore it for the time being.)
('epoch', 16)
COMPUTING...
train mean loss=0.198548029516, accuracy=0.932177149753
test mean loss=0.526535278777, accuracy=0.844512195849
.
.
('epoch', 23)
COMPUTING...
train mean loss=0.135178960405, accuracy=0.954375654268
test mean loss=0.686121761981, accuracy=0.814024389154

Think of loss as the error between the output predicted by Deep Learning and the correct answer. accuracy is the percentage of correct answers. In machine learning, the goal is to reduce the loss on the test data.
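As a worked example of the two numbers, the sketch below computes a loss and an accuracy for three made-up predictions. It assumes the loss is softmax cross-entropy, a common choice for classification that this chapter does not actually confirm; the logits and labels are invented for illustration.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean negative log-probability assigned to the correct class."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

def accuracy(logits, labels):
    """Fraction of samples whose highest-scoring class is the correct one."""
    return float((logits.argmax(axis=1) == labels).mean())

# Made-up network outputs for 3 samples and 3 classes.
logits = np.array([[2.0, 0.5, 0.1],
                   [0.2, 1.5, 0.3],
                   [1.0, 0.9, 0.8]])
labels = np.array([0, 1, 2])

print(softmax_cross_entropy(logits, labels))
print(accuracy(logits, labels))
```

The third sample is classified wrongly, so the accuracy is 2/3, while the loss also reflects how confident each prediction was.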

As learning progresses, the loss of both train and test goes down, but after epoch 16 the train loss keeps going down while the test loss starts to go up. This is a sign of overfitting. When this happens, training has reached its limit and is stopped. For the time being, the test loss of 0.526 at epoch 16 is the best result for this model. (Efforts to improve this accuracy will be discussed in a later chapter.)
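Picking the best epoch by hand works, but it can also be done in a couple of lines. This sketch uses only the four test losses visible in the log excerpt above (the other epochs are elided there, so they are missing here too); the dictionary and variable names are illustrative.

```python
# Test losses copied from the log excerpt (epochs 1, 2, 16 and 23 only).
test_loss = {
    1: 1.92266467359,
    2: 1.0855880198,
    16: 0.526535278777,
    23: 0.686121761981,
}

# The best model is the one with the lowest test loss, not the lowest train loss.
best_epoch = min(test_loss, key=test_loss.get)
best_model_file = 'model{}'.format(best_epoch)

print(best_epoch, test_loss[best_epoch])
print(best_model_file)
```

Selecting on test loss rather than train loss is the point: the train loss at epoch 23 is lower than at epoch 16, yet the model from epoch 23 generalizes worse.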

The learning result of each epoch is saved in the same directory as the source code, so keep the file 'model16', which has the best result. (You can delete the other model files.) (image: models.png)

Next, in Chapter 3, we will make actual predictions using this model.

chapter title
Chapter 1 Building a Deep Learning environment based on chainer
Chapter 2 Creating a Deep Learning Predictive Model by Machine Learning
Chapter 3 Character recognition using a model
Chapter 4 Improvement of recognition accuracy by expanding data
Chapter 5 Introduction to neural networks and explanation of source code
Chapter 6 Improvement of learning efficiency by selecting Optimizer
Chapter 7 TTA, Improvement of learning efficiency by Batch Normalization
