Hello, this is Licht. I got hold of the **Japanese character recognition data set** sold by the Environmental Research Institute, so I am publishing a tutorial that uses it, aimed at deep learning beginners. We will try to develop a Japanese character recognition engine.
As you can see from the image below, this tutorial is guaranteed to induce Gestaltzerfall, but I will do my best without getting discouraged.
This article is written for those who say:
・ I want to start Deep Learning!
・ I want to do a tutorial other than MNIST digit recognition!
・ I want to learn about Deep Learning related technologies!
・ I want to develop Japanese OCR by myself!

The tutorial follows the outline below.
chapter | title |
---|---|
Chapter 1 | Building a Deep Learning environment based on chainer |
Chapter 2 | Creating a Deep Learning Predictive Model by Machine Learning |
Chapter 3 | Character recognition using a model |
Chapter 4 | Improvement of recognition accuracy by expanding data |
Chapter 5 | Introduction to neural networks and explanation of source code |
Chapter 6 | Improvement of learning efficiency by selecting Optimizer |
Chapter 7 | TTA, Improvement of learning efficiency by Batch Normalization |
If you are completely new to Deep Learning, please work through Chapter 4 first, since you will probably want to see something running before anything else. Chapter 5 onward is for those who want to know more about Deep Learning.
**chainer is domestic (Japanese) OSS**. Best of all, it is easy to use and understand, and if you ask a question about chainer on its Google Group, you get a quick answer for free.
This tutorial assumes a Mac, but I will add Windows-specific notes where needed (the only difference is the environment setup).
・ Machine spec: 4 GB of memory or more
・ Python 2.7 series, with pip installed

On a Mac, run the following in the terminal:
sudo pip install chainer
This installs chainer 1.6.0, filelock 2.0.5, nose 1.3.7, numpy 1.10.4, and protobuf 2.6.1 in one go.
sudo pip install scipy
This installs scipy 0.17.0.
Also, please install the OpenCV 2.4.x series by referring to this article.

On Windows, run the following at the command prompt:
pip install chainer
This installs chainer 1.6.0, filelock 2.0.5, nose 1.3.7, numpy 1.10.4, and protobuf 2.6.1 in one go.
pip install scipy
This installs scipy 0.17.0. Start the command prompt in administrator mode if necessary. Also, please install the OpenCV 2.4.x series by referring to this article.
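After either set of install steps, it can help to confirm that the packages import and to see which versions you actually got. The following is only a minimal sketch, not part of the original tutorial: the helper names `installed_version` and `report` are made up for illustration, and the expected versions are simply the ones quoted above (OpenCV is assumed to be importable as `cv2`, with no exact version pinned).

```python
from __future__ import print_function
import importlib

# Versions quoted in this chapter. cv2 (OpenCV) has no exact version
# pinned here, only "2.4.x", so it is listed as None.
EXPECTED = {
    "chainer": "1.6.0",
    "numpy": "1.10.4",
    "scipy": "0.17.0",
    "cv2": None,
}


def installed_version(module_name):
    """Return the module's __version__, 'unknown' if it has none,
    or None if the module cannot be imported at all."""
    try:
        module = importlib.import_module(module_name)
    except Exception:
        # Any failure to import is treated as "not usable".
        return None
    return getattr(module, "__version__", "unknown")


def report(expected=EXPECTED):
    """Build one human-readable line per package."""
    lines = []
    for name in sorted(expected):
        found = installed_version(name)
        wanted = expected[name]
        if found is None:
            status = "NOT INSTALLED"
        elif wanted is None or found == wanted:
            status = "ok ({0})".format(found)
        else:
            status = "found {0}, tutorial used {1}".format(found, wanted)
        lines.append("{0}: {1}".format(name, status))
    return lines


if __name__ == "__main__":
    for line in report():
        print(line)
```

A version mismatch is not necessarily a problem; the script only tells you where your environment differs from the one described here.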
Purchase the Hiragana dataset (1000 yen) from the Environmental Research Institute website and download it. Create a directory called "HIRAGANA_NN" on your desktop and unzip the dataset into it.
-DESKTOP
 -HIRAGANA_NN
  -304a
  -304b
  ・
  ・

(Reference) It is OK if it looks like the image below.
In addition, directory names such as 304a are the Unicode code point of each hiragana (for example, 304a is U+304A, お), and the contents are as follows.
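The directory-name convention can be checked with a short script. This is a minimal sketch under stated assumptions: it reads each directory name as a hexadecimal Unicode code point, and it assumes the dataset was unzipped into `HIRAGANA_NN` on the desktop as described above. The function names `dir_to_char` and `list_classes` are hypothetical helpers, not part of the dataset or of chainer.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function
import os

try:
    unichr  # exists on Python 2
except NameError:
    unichr = chr  # Python 3: chr already returns a Unicode character


def dir_to_char(dir_name):
    """Interpret a name such as '304a' as a hexadecimal code point."""
    return unichr(int(dir_name, 16))


def list_classes(root):
    """Return (dir_name, character, entry_count) for every subdirectory
    of root whose name parses as a hexadecimal code point."""
    rows = []
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        if not os.path.isdir(path):
            continue
        try:
            char = dir_to_char(name)
        except (ValueError, OverflowError):
            continue  # skip directories that are not hex code points
        rows.append((name, char, len(os.listdir(path))))
    return rows


if __name__ == "__main__":
    # Assumed location, following the setup described in this chapter.
    root = os.path.join(os.path.expanduser("~"), "Desktop", "HIRAGANA_NN")
    if os.path.isdir(root):
        for name, char, count in list_classes(root):
            print(name, char, count)
    else:
        print("Directory not found:", root)
```

For example, `dir_to_char("304a")` returns お and `dir_to_char("304b")` returns か, which matches the first two directories in the layout above.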
You are now ready. In Chapter 2 we will move on to machine learning!