[Python] About data augmentation for deep learning

Introduction

In this article, I wrote a Python program that applies data augmentation (also called data expansion) to images in a batch; the transformations are the ones handled by Keras's ImageDataGenerator. I also give a brief description of data augmentation and some points to note. The source code is here.

What is data augmentation?

Building an image recognition or classification model with deep learning requires a large amount of high-quality training data. In most cases, however, only a small amount of training data is available. It can be increased by brute force (for example, by photographing the target objects yourself), but that takes a lot of time and effort. Data augmentation is a technique for exactly this situation: it increases the number of training samples by applying transformations such as flipping and enlarging / reducing to the original training data.

Data augmentation handled in this article

Many kinds of data augmentation exist, but this article deals with the following transformations. To show the result of each one, I used a cat image ([One day wormwood by sabamiso](http://www.igosso.net/se.cgi?q=%E3%81%82%E3%82%8B%E6%97%A5%E3%81%AE%E3%82%88%E3%82%82%E3%81%8E&sa=%E6%A4%9C%E7%B4%A2&lid=1&lia=1&lib=1&lic=1)) found through "Commercial free photo search".

--Rotation: rotates the image by an arbitrary angle. This presumably covers cases where the cat is photographed at different angles. [Images: cat.jpg (original), rot.jpg]

--Translation in horizontal and vertical directions: moves the subject of the image horizontally or vertically. This presumably covers cases where the cat appears toward the left or the top of the frame. [Images: cat.jpg (original), shift.jpg, height.jpg]

--Enlarge / Reduce: enlarges or reduces the subject of the image. This presumably covers cases where the cat is photographed from near or far. [Images: cat.jpg (original), scale.jpg]

--Color tone change: brightens or darkens the whole image. This presumably covers bright or dark shooting environments. [Images: cat.jpg (original), test.jpg]
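
The four transformations above can be sketched with NumPy alone. This is a minimal illustration, not the code in main.py and not how Keras implements them. The function names (`rotate90`, `shift`, `zoom_in`, `change_brightness`) are hypothetical, rotation is limited to multiples of 90 degrees, the zoom only enlarges (center crop followed by nearest-neighbour repetition), and pixels exposed by a shift are filled with zeros.

```python
import numpy as np


def rotate90(img, k=1):
    """Rotate an H x W x C image counter-clockwise by k * 90 degrees."""
    return np.rot90(img, k=k, axes=(0, 1))


def _windows(size, d):
    """Return (source, destination) slices for a shift of d along one axis."""
    d = max(-size, min(size, d))  # clamp so the slices stay valid
    src = slice(max(0, -d), size - max(0, d))
    dst = slice(max(0, d), size + min(0, d))
    return src, dst


def shift(img, dx=0, dy=0):
    """Translate the image dx pixels right and dy pixels down, zero-filling."""
    h, w = img.shape[:2]
    src_y, dst_y = _windows(h, dy)
    src_x, dst_x = _windows(w, dx)
    out = np.zeros_like(img)
    out[dst_y, dst_x] = img[src_y, src_x]
    return out


def zoom_in(img, factor=2):
    """Enlarge the image center by an integer factor (nearest neighbour).

    The output has the input size only when H and W are divisible by factor.
    """
    h, w = img.shape[:2]
    ch, cw = h // factor, w // factor  # size of the central crop
    top, left = (h - ch) // 2, (w - cw) // 2
    crop = img[top:top + ch, left:left + cw]
    # Repeat every pixel `factor` times along both spatial axes.
    return np.repeat(np.repeat(crop, factor, axis=0), factor, axis=1)


def change_brightness(img, delta):
    """Add an integer delta to every pixel and clip to the 0-255 range."""
    out = img.astype(np.int16) + delta
    return np.clip(out, 0, 255).astype(np.uint8)
```

Each function takes and returns an array, so the transformations can be chained, for example `change_brightness(shift(img, dx=3), 40)`.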

How to use the data augmentation program

Running "main.py" combines the transformations listed above and applies them to create training images. You can adapt it to your needs by setting the variables in the following source code.

Part of main.py


# Assumed import; the excerpt itself does not show it
from keras.preprocessing.image import ImageDataGenerator

# ------------------- Variables to set individually (start) -------------------

input_dir = 'trainImg'  # Folder containing the original training images
output_dir = 'output'   # Output folder for the augmented images
num = 10                # Number of augmented images to generate per original image

generator = ImageDataGenerator(
                rotation_range=90,         # Rotate at random by up to 90 degrees in either direction
                width_shift_range=0.1,     # Shift horizontally at random by up to 10% of the width
                height_shift_range=0.1,    # Shift vertically at random by up to 10% of the height
                zoom_range=0.3,            # Zoom at random within [0.7, 1.3]
                channel_shift_range=50.0,  # Add a random value to the pixel values of each channel
                horizontal_flip=False,     # Random left-right flip (disabled)
                vertical_flip=True         # Random top-bottom flip (enabled)
                )
# ------------------- Variables to set individually (end) -------------------

If "input_dir" contains 5 original training images and "main.py" is run with the settings above, 5 x 10 = 50 images are generated in "output_dir".
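
The batch behaviour described here (every file in the input folder yields `num` augmented files in the output folder) can be sketched as follows. This is not the actual main.py: `expand_folder` and the `augment` callback are hypothetical names, and files are treated as opaque bytes so that the example needs only the standard library. In a real run, the callback would decode the image and apply the configured ImageDataGenerator.

```python
import tempfile
from pathlib import Path


def expand_folder(input_dir, output_dir, num, augment):
    """Write `num` augmented copies of every file in input_dir to output_dir.

    `augment(data, index)` is a caller-supplied function (hypothetical here)
    that returns the bytes of one augmented image.
    """
    in_dir, out_dir = Path(input_dir), Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(p for p in in_dir.iterdir() if p.is_file()):
        data = src.read_bytes()
        for i in range(num):
            # e.g. cat.jpg -> cat_000.jpg, cat_001.jpg, ...
            dst = out_dir / f"{src.stem}_{i:03d}{src.suffix}"
            dst.write_bytes(augment(data, i))
            written.append(dst)
    return written


# Demo with 5 dummy "images" and a placeholder augment function
# that returns the data unchanged.
with tempfile.TemporaryDirectory() as tmp:
    src_dir = Path(tmp) / "trainImg"
    src_dir.mkdir()
    for n in range(5):
        (src_dir / f"img{n}.jpg").write_bytes(b"dummy")
    files = expand_folder(src_dir, Path(tmp) / "output", 10, lambda d, i: d)
    total = len(files)

print(total)  # 50
```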

Precautions for data augmentation

Data augmentation should not be applied indiscriminately. For example, suppose you want to build a character recognition model for hiragana. If data augmentation is applied to the character images (see: https://lab.ndl.go.jp/cms/hiragana73), the following problems occur.

--Rotation applied to the hiragana "i" (い) (left: original image, right: processed image) [Images: 1930_1175565_0024.png, test1.png] As the result shows, the processed image resembles the character "ko" (こ), so it may be misrecognized.

--Left-right flip applied to the hiragana "u" (う) (left: original image, right: processed image) [Images: 1930_1176246_0041.png, test.png] As the result shows, the processed image is a character that does not exist. Training on it is meaningless and lowers the accuracy of the character recognition model.
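
The left-right flip problem can be reproduced with a toy bitmap. The 5x5 glyphs below are made-up patterns (not taken from the hiragana dataset), and `flip_horizontal` and `is_flip_safe` are hypothetical helpers. The point is only that flipping an asymmetric character produces a different pattern that would still carry the original label.

```python
# An asymmetric, L-shaped glyph and a symmetric, T-shaped glyph.
GLYPH_L = [
    "#....",
    "#....",
    "#....",
    "#....",
    "#####",
]
GLYPH_T = [
    "#####",
    "..#..",
    "..#..",
    "..#..",
    "..#..",
]


def flip_horizontal(bitmap):
    """Mirror a bitmap (list of equal-length strings) left to right."""
    return [row[::-1] for row in bitmap]


def is_flip_safe(bitmap):
    """True only if mirroring leaves the glyph unchanged (it is symmetric)."""
    return flip_horizontal(bitmap) == bitmap


flipped = flip_horizontal(GLYPH_L)
print("\n".join(flipped))     # the vertical stroke is now on the right
print(is_flip_safe(GLYPH_L))  # False
print(is_flip_safe(GLYPH_T))  # True
```

A check like this only detects exact symmetry. Whether a flipped or rotated sample still belongs to its class has to be judged per dataset.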

Summary

In this article, I briefly described a Python program for data augmentation and the augmentation process itself. Data augmentation is easy to apply and, when it works well, improves accuracy. However, depending on what the training images depict, it can produce meaningless training images. It is therefore necessary to decide, based on the target of the training images, which transformations to use (rotation, etc.) and how strongly to apply them (for rotation, by how many degrees).
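
One way to make this choice explicit is to keep a separate set of augmentation settings per kind of target. In the sketch below, the "photo" values are copied from main.py, while the "character" values are illustrative assumptions rather than tuned recommendations. The keys reuse the ImageDataGenerator argument names, and `AUGMENTATION_PRESETS` and `settings_for` are hypothetical names.

```python
AUGMENTATION_PRESETS = {
    # Photos such as the cat image: the settings used in main.py.
    "photo": {
        "rotation_range": 90,
        "width_shift_range": 0.1,
        "height_shift_range": 0.1,
        "zoom_range": 0.3,
        "channel_shift_range": 50.0,
        "horizontal_flip": False,
        "vertical_flip": True,
    },
    # Characters: large rotations and flips can turn one class into
    # another (or into a non-existent character), so keep them small or off.
    "character": {
        "rotation_range": 5,
        "width_shift_range": 0.05,
        "height_shift_range": 0.05,
        "zoom_range": 0.1,
        "horizontal_flip": False,
        "vertical_flip": False,
    },
}


def settings_for(target):
    """Return a copy of the preset so callers can adjust it safely."""
    try:
        return dict(AUGMENTATION_PRESETS[target])
    except KeyError:
        raise ValueError(f"unknown target: {target!r}") from None
```

The returned dictionary could then be passed on as keyword arguments, for example `ImageDataGenerator(**settings_for("character"))`.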
