Introduction to AI creation with Python! Part 3 I tried to classify and predict images with a convolutional neural network (CNN)

About this article

This article continues from the previous one.

I would like to classify images using a convolutional neural network (CNN). In Part 1 of this series, we classified images of handwritten digits using an ordinary neural network. A CNN generally allows for more accurate image classification.

What is a Convolutional Neural Network (CNN)?

A CNN is one of the most commonly used deep learning models for working with images. It is called a "convolutional neural network" because it adds a process called "convolution" to an ordinary neural network.

What is "convolution processing"?

In recent years, smartphone cameras have become high quality, and a single photo can be several MB. Training directly on images of that size takes a lot of time because there is so much data. To make training more efficient, you need to reduce the size of the image.

However, simply shrinking the image is not enough. If the features of the image disappear when it is made smaller, it is no longer possible to tell what the image shows, and training on it is meaningless.

Convolution processing compresses the image data while retaining the characteristics of the original.

Specifically, the procedure is as follows.

  1. The "convolution layer" slides small filters called "kernels" over the image.
  2. Multiplying each image patch by a kernel and summing the result produces a "feature map", one per kernel.
  3. The "pooling layer" makes the resulting "feature maps" smaller.

Finally, the result is converted to a one-dimensional array and passed to an ordinary neural network for learning.
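The flow above can be sketched in a few lines of NumPy. This is only an illustration of the idea, not how Keras implements it: the 6x6 image, the edge-detecting kernel, and the helper names conv2d_valid and max_pool2d are all made up for this example.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]
            # Multiply the patch by the kernel element-wise and sum the result.
            feature_map[i, j] = np.sum(patch * kernel)
    return feature_map

def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    h = feature_map.shape[0] // size
    w = feature_map.shape[1] // size
    trimmed = feature_map[:h * size, :w * size]
    return trimmed.reshape(h, size, w, size).max(axis=(1, 3))

# Hypothetical 6x6 grayscale image: dark left half, bright right half.
image = np.array([[0, 0, 0, 1, 1, 1]] * 6, dtype=float)
# Hypothetical 3x3 kernel that responds to vertical edges.
kernel = np.array([[-1, 0, 1]] * 3, dtype=float)

feature_map = conv2d_valid(image, kernel)   # shape (4, 4)
pooled = max_pool2d(feature_map)            # shape (2, 2)
flat = pooled.flatten()                     # shape (4,), ready for a dense layer

print(feature_map)
print(pooled)
print(flat)
```

The feature map is large only where the kernel sits on the dark-to-bright edge, and pooling keeps that strong response while quartering the number of values.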

About the environment

In this article, the code runs in a Google Colaboratory environment with tensorflow version 1.13.1. If you need to downgrade, you can use the following command.

!pip install tensorflow==1.13.1

Library import

I will use tensorflow.keras this time as well.

from tensorflow.keras.datasets import cifar10
from tensorflow.keras.layers import Activation, Dense, Dropout, Conv2D, Flatten, MaxPool2D
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.utils import to_categorical
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

Image data preparation

The image data is downloaded with the cifar10 module. (train_images, train_labels) are the training images and their correct labels; (test_images, test_labels) are the images and correct labels for verification.

(train_images, train_labels), (test_images, test_labels) = cifar10.load_data()

Check the shape of the dataset. There are 50,000 RGB images of 32 x 32 pixels (32 x 32 x 3) for training and 10,000 for verification.

Let's check the contents of the images and their correct labels as well.

The meaning of each number is as follows.

Label "0": airplane
Label "1": automobile
Label "2": bird
Label "3": cat
Label "4": deer
Label "5": dog
Label "6": frog
Label "7": horse
Label "8": ship
Label "9": truck

Data set preprocessing

The values in train_images are numbers from 0 to 255 (the intensity of each RGB channel). To normalize them, divide uniformly by 255.

With an ordinary neural network, I had to flatten the training data to one dimension. Convolution takes 3D data as input, so normalization is the only preprocessing needed here.

train_images = train_images.astype('float32')/255.0
test_images = test_images.astype('float32')/255.0
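To see what this normalization does without loading the whole dataset, here is a small sketch on a made-up 2 x 2 RGB array. The pixel values are hypothetical and only stand in for one CIFAR-10 sample.

```python
import numpy as np

# Hypothetical 2x2 RGB "image" with uint8 channel values (0 to 255).
pixels = np.array([[[0, 128, 255], [64, 64, 64]],
                   [[255, 255, 255], [10, 20, 30]]], dtype=np.uint8)

# Same operation as in the article: cast to float32, then divide by 255.
scaled = pixels.astype('float32') / 255.0

print(scaled.dtype, scaled.shape)   # the shape is unchanged: (2, 2, 3)
print(scaled.min(), scaled.max())   # values now lie between 0.0 and 1.0
```

Casting before dividing matters: the result stays float32 rather than being promoted to float64, which halves the memory needed for the 50,000 training images.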

Also, convert the correct labels to a one-hot representation with to_categorical.

train_labels = to_categorical(train_labels, 10)
test_labels = to_categorical(test_labels, 10)
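For intuition, the conversion can be imitated with plain NumPy. The one_hot helper below is a hypothetical stand-in written for this example, not the Keras implementation, and the three sample labels are made up.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Turn integer class labels into rows with a single 1 at the label's index."""
    labels = np.asarray(labels).reshape(-1)          # CIFAR-10 labels have shape (N, 1)
    return np.eye(num_classes, dtype='float32')[labels]

# Hypothetical labels: frog (6), truck (9), airplane (0).
sample_labels = np.array([[6], [9], [0]])
encoded = one_hot(sample_labels, 10)

print(encoded.shape)   # (3, 10)
print(encoded[0])      # a 1 at index 6, zeros elsewhere
```

Each row has the same length as the model's 10-unit softmax output, which is what lets the loss compare the two directly.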

Modeling

The model is built with the following code.

model = Sequential()

#1st convolution process (Conv → Conv → Pool → Dropout)
model.add(Conv2D(32, (3, 3), activation='relu', padding='same', input_shape=(32, 32, 3)))
model.add(Conv2D(32, (3, 3), activation='relu', padding='same'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

#Second convolution process (Conv → Conv → Pool → Dropout)
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

#Classification by neural network (Flatten → Dense → Dropout → Dense)
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))

I will explain each part. First, create a sequential model with model = Sequential().

Next is the convolution process. This time, the convolution process runs twice before classification by a neural network.

I will explain the first convolution process. First, model.add(Conv2D(32, (3, 3), activation='relu', padding='same', input_shape=(32, 32, 3))) creates a convolution layer. Conv2D receives the number of kernels, the kernel size, the activation function, the padding, and the input size. Here there are 32 kernels of size 3x3, and the activation function is relu. padding='same' surrounds the input with zeros so that the output feature map keeps the same height and width as the input.
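The effect of zero padding on the output size can be checked with a short sketch. It assumes a stride of 1, and conv_output_size is a hypothetical helper written for this example rather than a Keras function.

```python
import numpy as np

def conv_output_size(input_size, kernel_size, padding):
    """Output height/width of a stride-1 convolution for 'same' or 'valid' padding."""
    if padding == 'same':
        return input_size
    return input_size - kernel_size + 1   # 'valid': no padding at all

# A 4x4 feature map of ones, surrounded by one ring of zeros.
feature_map = np.ones((4, 4))
padded = np.pad(feature_map, pad_width=1, mode='constant', constant_values=0)

print(padded.shape)                        # (6, 6)
print(conv_output_size(32, 3, 'same'))     # 32
print(conv_output_size(32, 3, 'valid'))    # 30
```

A 3x3 kernel sliding over the 6 x 6 padded array fits in 6 - 3 + 1 = 4 positions per axis, so the output is 4 x 4 again. Without padding, each convolution would shave two pixels off a 32-pixel image.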

The feature map created above is then passed through model.add(Conv2D(32, (3, 3), activation='relu', padding='same')), a second convolution layer that extracts further features from it.

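Next is the pooling layer. model.add(MaxPool2D(pool_size=(2, 2))) compresses the feature map. MaxPool2D implements the method called max pooling, which keeps only the largest value in each window. pool_size is the size of that window, so a 2x2 window halves both the height and the width.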

Finally, model.add(Dropout(0.25)) randomly disables 25% of the values during training to reduce overfitting. This completes the first convolution process.
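As a rough illustration of what dropout does, here is a NumPy sketch of the common "inverted dropout" scheme, in which the surviving values are scaled up by 1 / (1 - rate) during training. The dropout function, the seed, and the all-ones input are assumptions made for this example.

```python
import numpy as np

def dropout(x, rate, training, rng):
    """Zero a random fraction `rate` of x while training; pass x through otherwise."""
    if not training:
        return x
    keep_mask = rng.random(x.shape) >= rate
    # Scale the kept values so the expected sum stays the same.
    return x * keep_mask / (1.0 - rate)

rng = np.random.default_rng(0)     # fixed seed only to make the sketch repeatable
activations = np.ones(1000)        # hypothetical layer output

dropped = dropout(activations, 0.25, True, rng)
zero_fraction = np.mean(dropped == 0)

print(np.unique(dropped))          # only two distinct values: 0 and 1 / 0.75
print(zero_fraction)               # roughly a quarter of the values are zeroed
```

At inference time (training=False) nothing is dropped, which is why dropout only affects the learning phase.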

The same process is repeated once more with 64 kernels, then model.add(Flatten()) converts the result to one dimension, and an ordinary neural network performs the classification.
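To follow how the data shrinks on its way through the model, the shapes and parameter counts can be worked out by hand. The sketch below uses plain Python and assumes the layer settings shown in the model code (3x3 kernels, padding='same', 2x2 pooling); conv_params and dense_params are hypothetical helpers written for this example.

```python
def conv_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter for a Conv2D layer."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(inputs, units):
    """Weights plus one bias per unit for a Dense layer."""
    return (inputs + 1) * units

h, w, c = 32, 32, 3            # input image: 32 x 32 pixels, 3 channels
block_outputs = []
total = 0

for block_filters in (32, 64):
    for _ in range(2):         # two Conv2D layers per block; 'same' keeps h and w
        total += conv_params(3, 3, c, block_filters)
        c = block_filters
    h, w = h // 2, w // 2      # MaxPool2D(pool_size=(2, 2)) halves height and width
    block_outputs.append((h, w, c))

flat = h * w * c               # Flatten()
total += dense_params(flat, 512) + dense_params(512, 10)

print(block_outputs)           # [(16, 16, 32), (8, 8, 64)]
print(flat)                    # 4096
print(total)                   # 2168362
```

Most of the roughly 2.17 million parameters sit in the first Dense layer (4096 inputs to 512 units), not in the convolution layers. The result can be compared against model.summary().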

Conversion process to TPU model

Before compiling the model, convert it to a TPU model.

You could compile and train the model as it is, but convolutional neural networks require a huge amount of computation, so training takes a long time without a TPU.

Follow the steps below to convert.

#Conversion to TPU model
import tensorflow as tf
import os
tpu_model = tf.contrib.tpu.keras_to_tpu_model(
    model,
    strategy=tf.contrib.tpu.TPUDistributionStrategy(
        tf.contrib.cluster_resolver.TPUClusterResolver(tpu='grpc://' + os.environ['COLAB_TPU_ADDR'])
    )
)
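The conversion reads the TPU address from the COLAB_TPU_ADDR environment variable, so it fails with a KeyError if the Colab runtime is not set to TPU. A small guard like the following can make that easier to spot. The function name tpu_grpc_address and the sample address are hypothetical and used only for illustration.

```python
import os

def tpu_grpc_address(environ=os.environ):
    """Return the 'grpc://...' TPU address, or None when no TPU runtime is attached."""
    addr = environ.get('COLAB_TPU_ADDR')
    return 'grpc://' + addr if addr else None

# Hypothetical environments, passed in explicitly so the sketch runs anywhere.
print(tpu_grpc_address({'COLAB_TPU_ADDR': '10.0.0.2:8470'}))
print(tpu_grpc_address({}))

if tpu_grpc_address() is None:
    print('No TPU found: check that the runtime type is set to TPU.')
```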

Compiling the model

For the loss function, use categorical_crossentropy, which is suited to classification. Set the optimizer to Adam (learning rate 0.001) and the evaluation metric to acc (accuracy).

tpu_model.compile(loss='categorical_crossentropy', optimizer=Adam(lr=0.001), metrics=['acc'])
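To make the loss concrete, here is a plain-Python sketch of softmax followed by categorical cross-entropy for a single sample. The three-class scores are made up, and the small eps clamp is an assumption added to avoid log(0); it is an illustration of the formula, not the Keras code.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]   # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(one_hot_label, probabilities, eps=1e-7):
    """Negative log of the probability assigned to the correct class."""
    return -sum(t * math.log(max(p, eps))
                for t, p in zip(one_hot_label, probabilities))

probs = softmax([2.0, 1.0, 0.1])                 # hypothetical scores for 3 classes
loss_if_class_0 = categorical_crossentropy([1, 0, 0], probs)
loss_if_class_1 = categorical_crossentropy([0, 1, 0], probs)

print(probs)
print(loss_if_class_0, loss_if_class_1)
```

The loss is small when the model puts high probability on the correct class and grows as that probability falls, which is the quantity Adam tries to reduce.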

Learning

Train the created model. With the TPU model, training is slow the first time, but the second and subsequent times are fast. Training with a normal model instead of the TPU model takes more than twice as long.

history = tpu_model.fit(train_images, train_labels, batch_size=128,
    epochs=20, validation_split=0.1)
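The arguments above determine how much data each epoch sees. The arithmetic below is a sketch assuming ordinary Keras behavior, where validation_split holds out the last fraction of the training data; batching on a TPU may differ in detail.

```python
import math

num_samples = 50000          # CIFAR-10 training images
validation_split = 0.1
batch_size = 128
epochs = 20

num_val = round(num_samples * validation_split)       # held out for validation
num_train = num_samples - num_val                     # actually used for weight updates
steps_per_epoch = math.ceil(num_train / batch_size)   # batches per epoch
last_batch = num_train - (steps_per_epoch - 1) * batch_size
total_updates = steps_per_epoch * epochs

print(num_train, num_val)             # 45000 5000
print(steps_per_epoch, last_batch)    # 352 72
print(total_updates)                  # 7040
```

So val_acc in the training history is measured on 5,000 images the model never trains on, separately from the 10,000 test images.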

Graph display of learning results

During training, the accuracy seems to exceed 90%, which is quite high.

plt.plot(history.history['acc'], label='acc')
plt.plot(history.history['val_acc'], label='val_acc')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(loc='best')
plt.show()


Evaluation of learning

When I evaluated the model on the test data, the accuracy dropped to 71.2%. Accuracy on new images is not so high, so there seems to be room for improvement.

test_loss, test_acc = tpu_model.evaluate(test_images, test_labels)
print('loss: {:.3f}\nacc: {:.3f}'.format(test_loss, test_acc ))


Inference

Finally, inference. Pass images to the model and check what it predicts. Google Colab's TPU is composed of 8 cores, so the number of samples passed must be divisible by 8. Therefore, I will use 16 images.

#Display of inferred image
for i in range(16):
    plt.subplot(2, 8, i+1)
    plt.imshow(test_images[i])
plt.show()

#Display the inferred label
test_predictions = tpu_model.predict(test_images[0:16])
test_predictions = np.argmax(test_predictions, axis=1)[0:16]
labels = ['airplane', 'automobile', 'bird', 'cat', 'deer',
        'dog', 'frog', 'horse', 'ship', 'truck']
print([labels[n] for n in test_predictions])
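The two ideas in this step, keeping the sample count divisible by 8 and turning probability rows into label names, can be tried without a TPU. In the sketch below, usable_count is a hypothetical helper and the probability rows are made-up softmax outputs, not real model predictions.

```python
import numpy as np

labels = ['airplane', 'automobile', 'bird', 'cat', 'deer',
          'dog', 'frog', 'horse', 'ship', 'truck']

def usable_count(num_images, cores=8):
    """Largest number of images <= num_images that is divisible by `cores`."""
    return (num_images // cores) * cores

# Hypothetical softmax outputs for two images (each row sums to 1).
fake_predictions = np.array([
    [0.01, 0.02, 0.05, 0.70, 0.02, 0.15, 0.02, 0.01, 0.01, 0.01],
    [0.05, 0.10, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.80, 0.05],
])

# argmax picks the index of the highest probability in each row.
predicted_ids = np.argmax(fake_predictions, axis=1)
predicted_names = [labels[n] for n in predicted_ids]

print(usable_count(21))     # 16
print(predicted_names)      # ['cat', 'ship']
```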


The images are small and difficult to make out, but the model does seem to be able to predict them. Next time, I would like to predict the same image data with a CNN called ResNet.
