Python: Application of image recognition using CNN

Data augmentation

ImageDataGenerator

Image recognition requires a large number of pairs of image data and labels (training data).

However, collecting a sufficient number of image and label pairs is often difficult and expensive.

Therefore, a common technique for increasing the amount of data to a sufficient level is data augmentation (inflating the images).

Simply copying the data to increase its amount adds no useful information. Instead, new data is created by, for example, flipping or shifting the existing images.


Here, we will use Keras' ImageDataGenerator for data augmentation.

ImageDataGenerator has many arguments, and by specifying them appropriately you can easily transform the data.

You can also combine multiple transformations to generate new images. Let's look at some commonly used arguments of ImageDataGenerator.

datagen = ImageDataGenerator(rotation_range=0.,
                            width_shift_range=0.,
                            height_shift_range=0.,
                            shear_range=0.,
                            zoom_range=0.,
                            channel_shift_range=0,
                            horizontal_flip=False,
                            vertical_flip=False)
rotation_range     : Range (in degrees) within which to randomly rotate the image.
width_shift_range  : Range for random horizontal shifts, as a fraction of the image width.
height_shift_range : Range for random vertical shifts, as a fraction of the image height.
shear_range        : Shear intensity (in degrees). Larger values make the image look more diagonally squashed or stretched.
zoom_range         : Range for random zoom. Images are compressed down to a factor of 1 - zoom_range and enlarged up to a factor of 1 + zoom_range.
channel_shift_range: For an RGB 3-channel input, a random value in the range 0-255 is added to or subtracted from each of the R, G, and B channels.
horizontal_flip    : If True, images are randomly flipped horizontally.
vertical_flip      : If True, images are randomly flipped vertically.
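For example, a generator that applies several of these transformations at once could be configured as follows (the specific values here are purely illustrative, not a recommendation):

from keras.preprocessing.image import ImageDataGenerator

# Illustrative augmentation settings (values chosen only for demonstration)
datagen = ImageDataGenerator(rotation_range=20,       # rotate randomly by up to 20 degrees
                             width_shift_range=0.1,   # shift horizontally by up to 10% of the width
                             height_shift_range=0.1,  # shift vertically by up to 10% of the height
                             zoom_range=0.2,          # zoom randomly in the range [0.8, 1.2]
                             horizontal_flip=True)    # flip horizontally at random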
flow
flow(x, y=None, batch_size=32, shuffle=True, seed=None, save_to_dir=None, save_prefix='', save_format='png', subset=None)
# Takes numpy arrays of data and labels, and generates batches of augmented/normalized data.
Arguments
x: Data. Must be 4-dimensional. Use 1 channel for grayscale data and 3 channels for RGB data.
y: Labels.
batch_size: Integer (default: 32). Size of the batches of data.
shuffle: Boolean (default: True). Whether to shuffle the data.
save_to_dir: None or string (default: None). A directory in which to save the generated augmented images (useful for visualizing what is being done).
save_prefix: String (default: ''). Prefix added to the file names of the saved images (only valid when save_to_dir is given).
save_format: "png" or "jpeg" (only valid when save_to_dir is given). The default is "png".
Return value
An iterator that yields tuples (x, y), where x is a Numpy array of image data and y is the corresponding Numpy array of labels.

There are several other arguments for various kinds of processing; if you are interested, please refer to the official Keras documentation.
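As a minimal sketch of how the constructor and flow() fit together (the augmentation settings below are just an illustrative assumption):

from keras.datasets import cifar10
from keras.preprocessing.image import ImageDataGenerator

(X_train, y_train), (_, _) = cifar10.load_data()

# Generator with a few illustrative augmentation settings
datagen = ImageDataGenerator(rotation_range=20,
                             width_shift_range=0.1,
                             horizontal_flip=True)

# flow() returns an iterator that yields augmented batches of (x, y)
g = datagen.flow(X_train[:100], y_train[:100], batch_size=32, shuffle=True)
X_batch, y_batch = next(g)
print(X_batch.shape, y_batch.shape)   # (32, 32, 32, 3) (32, 1)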

Normalization

Various normalization methods

The image below shows an example of normalization. Normalization is the process of transforming data according to some rule so that it becomes easier to use.

(Image: example of normalization, in which lighting differences between the images are removed)

In this example, normalization unifies the way the light hits the images. It removes differences between data samples that are not directly relevant to training, which can greatly improve learning efficiency.

A graph comparing CIFAR-10 classification with and without Batch Normalization (BN) shows that the accuracy increases significantly when normalization is applied.

In recent years, deep neural network models may not need explicit normalization as much, but it is undoubtedly very useful when using a simple model.

There are various normalization methods used in deep learning. Typical ones include:

Batch Normalization (BN)
Principal Component Analysis (PCA)
Singular Value Decomposition (SVD)
Zero-phase Component Analysis (ZCA)
Local Response Normalization (LRN)
Global Contrast Normalization (GCN)
Local Contrast Normalization (LCN)

These normalization methods can be broadly divided into "standardization" and "whitening".
We will look at each of them in the following sections.

Standardization

Standardization is a technique that transforms each individual feature to have mean 0 and variance 1, bringing the distributions of the features closer to each other.

The image below shows CIFAR-10 images standardized with respect to each feature (here, the three R, G, and B channels). (Some additional processing has been applied to make the result easier to see.)

After standardization, the color intensities are averaged out and the images look grayish. Conversely, colors (R, G, or B) that previously stood out little are now weighted at the same level as the other colors, which makes it easier to find hidden features.

(Image: CIFAR-10 images after standardization)
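Conceptually, samplewise standardization (roughly what samplewise_center and samplewise_std_normalization do) just subtracts each image's mean and divides by its standard deviation. A minimal numpy sketch of the idea, not the exact Keras internals:

import numpy as np

def standardize_sample(img):
    # img: a single image array, e.g. of shape (32, 32, 3)
    img = img.astype('float64')
    return (img - img.mean()) / img.std()   # result has mean 0 and standard deviation 1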

Here is an implementation example:

import matplotlib.pyplot as plt
from keras.datasets import cifar10
from keras.preprocessing.image import ImageDataGenerator

(X_train, y_train), (X_test, y_test) = cifar10.load_data()

for i in range(10):
    plt.subplot(2, 5, i + 1)
    plt.imshow(X_train[i])
plt.suptitle('base images', fontsize=12)
plt.show()

# Create a generator that performs samplewise standardization
datagen = ImageDataGenerator(samplewise_center=True, samplewise_std_normalization=True)

# Standardization
g = datagen.flow(X_train, y_train, shuffle=False)
X_batch, y_batch = next(g)

# Rescale the generated images to the 0-255 range so they are easier to view
X_batch *= 127.0 / max(abs(X_batch.min()), abs(X_batch.max()))
X_batch += 127.0
X_batch = X_batch.astype('uint8')

for i in range(10):
    plt.subplot(2, 5, i + 1)
    plt.imshow(X_batch[i])
plt.suptitle('standardization results', fontsize=12)
plt.show()


Whitening

Whitening is the process of eliminating the correlation between data features.

The image below shows CIFAR-10 images whitened with respect to each feature (here, the three R, G, and B channels). (Some additional processing has been applied to make the result easier to see.)

After whitening, the images look darker overall and the edges are emphasized. This is because whitening suppresses intensity values that can easily be predicted from the surrounding pixels.

In other words, whitening emphasizes edges, which carry a lot of information, rather than flat surfaces and backgrounds, which carry little, and this can improve learning efficiency.
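As a rough sketch of what ZCA whitening computes internally (the epsilon value and the exact procedure here are assumptions; the Keras implementation may differ in its details):

import numpy as np

def zca_whiten(X, epsilon=1e-6):
    # X: array of shape (n_samples, n_features), e.g. flattened images
    X = X - X.mean(axis=0)                        # center each feature
    cov = np.dot(X.T, X) / X.shape[0]             # feature covariance matrix
    U, S, _ = np.linalg.svd(cov)                  # eigendecomposition via SVD
    W = U.dot(np.diag(1.0 / np.sqrt(S + epsilon))).dot(U.T)  # ZCA whitening matrix
    return X.dot(W)                               # features are now decorrelated

In practice, Keras handles this internally when zca_whitening=True is passed to ImageDataGenerator, as in the snippet below.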

# Create a generator that performs ZCA whitening
datagen = ImageDataGenerator(featurewise_center=True, zca_whitening=True)

# Whitening (fit computes the statistics needed for the transformation)
datagen.fit(X_train)
g = datagen.flow(X_train, y_train, shuffle=False)
X_batch, y_batch = next(g)

(Image: CIFAR-10 images after ZCA whitening)

Here is a full implementation example:

import matplotlib.pyplot as plt
from keras.datasets import cifar10
from keras.preprocessing.image import ImageDataGenerator

(X_train, y_train), (X_test, y_test) = cifar10.load_data()

# Use only 300 training samples and 100 test samples this time
X_train = X_train[:300]
X_test = X_test[:100]
y_train = y_train[:300]
y_test = y_test[:100]

for i in range(10):
    plt.subplot(2, 5, i + 1)
    plt.imshow(X_train[i])
plt.suptitle('base images', fontsize=12)
plt.show()

# Create a generator that performs ZCA whitening
datagen = ImageDataGenerator(featurewise_center=True, zca_whitening=True)

# Whitening
datagen.fit(X_train)
g = datagen.flow(X_train, y_train, shuffle=False)
X_batch, y_batch = next(g)

# Rescale the generated images to the 0-255 range so they are easier to view
X_batch *= 127.0 / max(abs(X_batch.min()), abs(X_batch.max()))
X_batch += 127.0
X_batch = X_batch.astype('uint8')

for i in range(10):
    plt.subplot(2, 5, i + 1)
    plt.imshow(X_batch[i])
plt.suptitle('whitening results', fontsize=12)
plt.show()


Batch normalization

In deep learning, when training with mini-batches, standardization is performed on each batch; this is called "batch normalization".

In Keras, it can be incorporated into the model with the model's add method, just like fully connected layers, convolution layers, activation functions, and so on:

model.add(BatchNormalization())

Batch normalization can be applied not only as data preprocessing but also to the outputs of intermediate layers. It is especially effective after activation functions whose output range is not bounded, such as ReLU, because it makes learning easier.
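Conceptually, at training time batch normalization standardizes the values within each mini-batch and then applies a learned scale and shift. A rough numpy sketch (gamma, beta, and epsilon are illustrative placeholders for the learned and fixed parameters):

import numpy as np

def batch_norm(x_batch, gamma=1.0, beta=0.0, epsilon=1e-5):
    # x_batch: activations for one mini-batch, shape (batch_size, n_features)
    mean = x_batch.mean(axis=0)                        # per-feature mean over the batch
    var = x_batch.var(axis=0)                          # per-feature variance over the batch
    x_hat = (x_batch - mean) / np.sqrt(var + epsilon)  # standardize to mean 0, variance 1
    return gamma * x_hat + beta                        # learned scale and shift

The Keras example below compares model1 (sigmoid activations, no batch normalization) with model2 (ReLU activations followed by batch normalization).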

import numpy as np
import matplotlib.pyplot as plt
from keras.datasets import mnist
from keras.layers import Activation, Conv2D, Dense, Flatten, MaxPooling2D, BatchNormalization
from keras.models import Sequential, load_model
from keras.utils.np_utils import to_categorical

(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = np.reshape(a=X_train, newshape=(-1, 28, 28, 1))[:300]
X_test = np.reshape(a=X_test, newshape=(-1, 28, 28, 1))[:300]
y_train = to_categorical(y_train)[:300]
y_test = to_categorical(y_test)[:300]

# Definition of model1 (a model that uses the sigmoid activation function)
model1 = Sequential()
model1.add(Conv2D(input_shape=(28, 28, 1), filters=32,
                 kernel_size=(2, 2), strides=(1, 1), padding="same"))
model1.add(MaxPooling2D(pool_size=(2, 2)))
model1.add(Conv2D(filters=32, kernel_size=(
    2, 2), strides=(1, 1), padding="same"))
model1.add(MaxPooling2D(pool_size=(2, 2)))
model1.add(Flatten())
model1.add(Dense(256))
model1.add(Activation('sigmoid'))
model1.add(Dense(128))
model1.add(Activation('sigmoid'))
model1.add(Dense(10))
model1.add(Activation('softmax'))


model1.compile(optimizer='sgd', loss='categorical_crossentropy',
              metrics=['accuracy'])
# Training
history = model1.fit(X_train, y_train, batch_size=32, epochs=3, validation_data=(X_test, y_test))

# Visualization of the accuracy per epoch
plt.plot(history.history['acc'], label='acc', ls='-', marker='o')
plt.plot(history.history['val_acc'], label='val_acc', ls='-', marker='x')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend()
plt.suptitle('model1', fontsize=12)
plt.show()


# Definition of model2 (a model that uses the ReLU activation function and batch normalization)
model2 = Sequential()
model2.add(Conv2D(input_shape=(28, 28, 1), filters=32,
                 kernel_size=(2, 2), strides=(1, 1), padding="same"))
model2.add(MaxPooling2D(pool_size=(2, 2)))
model2.add(Conv2D(filters=32, kernel_size=(
    2, 2), strides=(1, 1), padding="same"))
model2.add(MaxPooling2D(pool_size=(2, 2)))
model2.add(Flatten())
model2.add(Dense(256))
model2.add(Activation('relu'))
#Added batch normalization below
model2.add(BatchNormalization())
model2.add(Dense(128))
model2.add(Activation('relu'))
#Added batch normalization below
model2.add(BatchNormalization())
model2.add(Dense(10))
model2.add(Activation('softmax'))


model2.compile(optimizer='sgd', loss='categorical_crossentropy',
              metrics=['accuracy'])
# Training
history = model2.fit(X_train, y_train, batch_size=32, epochs=3, validation_data=(X_test, y_test))

# Visualization of the accuracy per epoch
plt.plot(history.history['acc'], label='acc', ls='-', marker='o')
plt.plot(history.history['val_acc'], label='val_acc', ls='-', marker='x')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend()
plt.suptitle("model2", fontsize=12)
plt.show()


Transfer learning

Transfer learning

Training a large neural network takes a lot of time and requires a large amount of data. In such cases, it is useful to start from a model that has already been trained on a large dataset and published. Training a new model with the help of such a trained model is called "transfer learning".

In Keras, you can download and use the weights of an image classification model trained on ImageNet (a huge dataset of 1.2 million images in 1,000 classes).

There are several types of published models, but here we will use a model called VGG16 as an example.


The VGG model is a network created by the VGG (Visual Geometry Group) team at the University of Oxford; it came in second in ILSVRC 2014, a large-scale image recognition competition.

It repeatedly applies 2 to 4 consecutive convolutions with small filters followed by pooling, and it is characterized by being quite deep for a network of its time.

The variants of the VGG model with 16 and 19 weighted layers (convolution and fully connected layers) are called VGG16 and VGG19, respectively.

VGG16 is a neural network with 13 convolution layers + 3 fully connected layers = 16 layers.

The original VGG model is a 1,000-class classification model, so it has 1,000 output units. By discarding the final fully connected layers and using an intermediate layer for feature extraction, it can be used for transfer learning.

Also, you do not have to worry much about the size of the input image. This is because the convolution layers of VGG16 use a small 3x3 kernel with padding='same', so unless the input image is extremely small, a sufficient number of features is preserved through the 13 convolution layers.

VGG16

Let's classify the CIFAR-10 dataset in Keras using transfer learning, combining the VGG16 model with the Sequential model we have been using so far.

First, create the VGG16 model.

from keras.applications.vgg16 import VGG16
from keras.layers import Input

# The weights pretrained on ImageNet are also loaded
input_tensor = Input(shape=(32, 32, 3))
vgg16 = VGG16(include_top=False, weights='imagenet', input_tensor=input_tensor)

input_tensor is an optional Keras tensor (that is, the output of layers.Input()) to use as the image input for the model.

include_top specifies whether to use the final fully connected layers of the original model. By setting it to False, only the convolutional feature-extraction part of the original model is used, and you can attach your own model after it.

If 'imagenet' is specified for weights, the weights trained on ImageNet are used; if None is specified, random weights are used.

To add layers after the feature-extraction part, define a separate model in advance (here, top_model) and combine it with VGG16 as follows.

top_model = Sequential()
top_model.add(Flatten(input_shape=vgg16.output_shape[1:]))
top_model.add(Dense(256, activation='sigmoid'))
top_model.add(Dropout(0.5))
top_model.add(Dense(10, activation='softmax'))
model = Model(inputs=vgg16.input, outputs=top_model(vgg16.output))

The weights of the vgg16 feature-extraction part would be destroyed if they were updated during training, so freeze them as follows.

# The first 19 layers of model make up the vgg16 part, so freeze their weights
for layer in model.layers[:19]:
    layer.trainable = False

Compiling and training are done in the same way as before, but for transfer learning it is better to choose SGD as the optimizer (here with a small learning rate).

model.compile(loss='categorical_crossentropy',
              optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
              metrics=['accuracy'])

Here is an implementation example:

from keras import optimizers
from keras.applications.vgg16 import VGG16
from keras.datasets import cifar10
from keras.layers import Dense, Dropout, Flatten, Input
from keras.models import Model, Sequential
from keras.utils.np_utils import to_categorical
import matplotlib.pyplot as plt
import numpy as np

(X_train, y_train), (X_test, y_test) = cifar10.load_data()
X_train = X_train[:300]
X_test = X_test[:100]
y_train = to_categorical(y_train)[:300]
y_test = to_categorical(y_test)[:100]

# Define input_tensor
input_tensor = Input(shape=(32, 32, 3))

vgg16 = VGG16(include_top=False, weights='imagenet', input_tensor=input_tensor)

top_model = Sequential()
top_model.add(Flatten(input_shape=vgg16.output_shape[1:]))
top_model.add(Dense(256, activation='sigmoid'))
top_model.add(Dropout(0.5))
top_model.add(Dense(10, activation='softmax'))

# Combine vgg16 and top_model
model = Model(inputs=vgg16.input, outputs=top_model(vgg16.output))

# Freeze the weights of the first 19 layers (the vgg16 part)
for layer in model.layers[:19]:
    layer.trainable = False

model.compile(loss='categorical_crossentropy',
              optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
              metrics=['accuracy'])


model.load_weights('param_vgg.hdf5')
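# Note: the load_weights call above assumes a previously saved 'param_vgg.hdf5' file is available (see the save_weights line below)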

model.fit(X_train, y_train, validation_data=(X_test, y_test), batch_size=32, epochs=1)

# The model weights can be saved as follows (not executed here)
# model.save_weights('param_vgg.hdf5')

#Evaluation of accuracy
scores = model.evaluate(X_test, y_test, verbose=1)
print('Test loss:', scores[0])
print('Test accuracy:', scores[1])

# Visualization of the first 10 test images
for i in range(10):
    plt.subplot(2, 5, i+1)
    plt.imshow(X_test[i])
plt.suptitle("10 images of test data",fontsize=16)
plt.show()

# Predictions for the first 10 test images
pred = np.argmax(model.predict(X_test[0:10]), axis=1)
print(pred)

model.summary()
