[For AI beginners] Explain mnist_transfer_cnn.py line by line (learn MNIST with Keras)

Introduction

This is the third and final article in a three-part series. It simply explains mnist_transfer_cnn.py line by line. Some content overlaps with the previous articles, which is intentional so that this article can be read on its own. It is intended for people who are interested in AI but have not yet tried it. After reading it, you should understand the basic training flow of deep learning. (It was originally written as in-house training material.)

  1. [For AI beginners] Explain mnist_mlp.py line by line (learn MNIST with Keras)
  2. [For AI beginners] Explain mnist_cnn.py line by line (learn MNIST with Keras)
  3. [For AI beginners] Explain mnist_transfer_cnn.py line by line (learn MNIST with Keras)

How to run the code

MNIST is image data, so a GPU is recommended for running this code (training on a CPU is painfully slow). The recommended method is to use Google Colaboratory. There are only two things to do:

  · Open a new Python 3 notebook
  · Enable the GPU from the Runtime menu

You can now use the GPU. Paste the code into a cell and execute it (the shortcut is CTRL + ENTER) and it will run.

About MNIST

A dataset of handwritten digit images, often used in machine learning tutorials.

  · Content: handwritten digits from 0 to 9
  · Image size: 28px * 28px
  · Color: grayscale (one channel, pixel values 0 to 255)
  · Data size: 70,000 images with labels (60,000 for training, 10,000 for testing)

What is Fine-tuning?

Fine-tuning reuses the parameters of an existing, well-trained model as the initial values for a different task. Doing so can be expected to reduce the computation cost and improve accuracy.

In this article, the steps are:

  1. Train a model to classify images of the digits 0-4 (this creates the base weights)
  2. Freeze the weights of the layers that extract image features so that they are not updated
  3. Train on images of the digits 5-9 (only the weights of the fully connected layers, i.e. the classification part, are updated)
  4. The result is a model that takes images of the handwritten digits 5 to 9 as input and classifies them into those 5 classes

Note: Because the final model is trained to classify images of 5 to 9, it cannot classify images of 0 to 4.

Code description

Preparation

'''Transfer learning toy example.
1 - Train a simple convnet on the MNIST dataset the first 5 digits [0..4].
2 - Freeze convolutional layers and fine-tune dense layers
   for the classification of digits [5..9].
Get to 99.8% test accuracy after 5 epochs
for the first five digits classifier
and 99.2% for the last five digits after transfer + fine-tuning.
'''

# Not needed in Python 3 (only required when the code is run under Python 2)
from __future__ import print_function

# Import the required libraries
import datetime
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K

# Alias for the function that returns the current time (called later as now())
now = datetime.datetime.now

# Constants
batch_size = 128   # Batch size. Number of samples processed in one update
num_classes = 5    # Number of classes. Each task classifies handwritten digits into 5 classes (0-4, then 5-9)
epochs = 5         # Number of epochs. How many times to go through all the training data
img_rows, img_cols = 28, 28  # Height and width of the input image in pixels
filters = 32       # Number of convolution filters
pool_size = 2      # Max pooling size
kernel_size = 3    # Convolution filter (kernel) size
# Input data shape (depends on the image data format of the backend)
if K.image_data_format() == 'channels_first':
    input_shape = (1, img_rows, img_cols)
else:
    input_shape = (img_rows, img_cols, 1)

For details on the data shape, see the data preprocessing part of [For AI beginners] Explain mnist_cnn.py line by line (learn MNIST with Keras).

The image data format differs depending on the Keras backend: Theano uses channels_first and TensorFlow uses channels_last, so the code checks which one is in use. This time the backend is TensorFlow, so input_shape is (28, 28, 1).
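The branching on the data format can be sketched without Keras. The helper name make_input_shape below is hypothetical (it is not part of Keras or of the original script); it only mirrors the if/else above and shows how the batch dimension is later prepended when the images are reshaped.

```python
def make_input_shape(data_format, rows=28, cols=28, channels=1):
    """Hypothetical helper mirroring the if/else on K.image_data_format()."""
    if data_format == 'channels_first':
        return (channels, rows, cols)
    if data_format == 'channels_last':
        return (rows, cols, channels)
    raise ValueError('unknown data format: %r' % (data_format,))


theano_style = make_input_shape('channels_first')      # (1, 28, 28)
tensorflow_style = make_input_shape('channels_last')   # (28, 28, 1)

# The training function later prepends the number of samples to this shape,
# e.g. 30596 training images of the digits 0-4.
batch_shape = (30596,) + tensorflow_style               # (30596, 28, 28, 1)
```

Only the position of the channel axis changes between the two formats; the number of values per image is the same.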

Data preprocessing

# Load the MNIST data, already split into training data (60,000 samples) and test data (10,000 samples)
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Create separate datasets for labels below 5 and labels 5 and above
# Training images with a label value less than 5
x_train_lt5 = x_train[y_train < 5]
# Training labels with a label value less than 5
y_train_lt5 = y_train[y_train < 5]
# Test images with a label value less than 5
x_test_lt5 = x_test[y_test < 5]
# Test labels with a label value less than 5
y_test_lt5 = y_test[y_test < 5]

# Training images with a label value of 5 or higher
x_train_gte5 = x_train[y_train >= 5]
# Training labels with a label value of 5 or higher, minus 5 (5-9 => 0-4)
y_train_gte5 = y_train[y_train >= 5] - 5
# Test images with a label value of 5 or higher
x_test_gte5 = x_test[y_test >= 5]
# Test labels with a label value of 5 or higher, minus 5 (5-9 => 0-4)
y_test_gte5 = y_test[y_test >= 5] - 5

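First, in order to create a model that classifies images of 0 to 4, the data is split into the digits 0 to 4 and the digits 5 to 9.

Next, for the 5-to-9 data, the labels are shifted from 5-9 to 0-4. Both tasks then use class indices 0 to 4, which matches the 5-unit output layer and the one-hot encoding with num_classes = 5.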
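The split relies on NumPy boolean-mask indexing. Here is a minimal sketch on made-up data (six tiny 2x2 "images" rather than real MNIST) showing what the masks select and how the labels are shifted. The variable names are chosen for illustration only.

```python
import numpy as np

# Made-up stand-in for MNIST: six 2x2 "images" and their labels.
x = np.arange(6 * 2 * 2).reshape(6, 2, 2)
y = np.array([0, 7, 3, 9, 5, 4])

# A boolean mask with one entry per sample: True where the label is below 5.
mask_lt5 = y < 5

# Indexing with the mask keeps only the samples where the mask is True.
x_lt5, y_lt5 = x[mask_lt5], y[mask_lt5]

# The complementary mask selects labels 5-9; subtracting 5 maps them to 0-4.
x_gte5, y_gte5 = x[~mask_lt5], y[~mask_lt5] - 5

print(y_lt5)   # [0 3 4]
print(y_gte5)  # [2 4 0]
```

Because the same mask is applied to the images and to the labels, each image stays paired with its own label.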

Model definition

# Model definition (a pattern that does not use the .add() method)

# Learn features with convolution
feature_layers = [
    # Convolution layer (32 filters, filter size (3, 3), input size (28, 28, 1))
    Conv2D(filters, kernel_size,
           # padding='valid' means no zero padding. Specify 'same' to pad with zeros
           padding='valid',
           input_shape=input_shape),
    # Activation function: ReLU
    Activation('relu'),
    # Convolution layer (32 filters, filter size (3, 3))
    Conv2D(filters, kernel_size),
    # Activation function: ReLU
    Activation('relu'),
    # Pooling layer
    MaxPooling2D(pool_size=pool_size),
    # Dropout with a rate of 0.25
    Dropout(0.25),
    # Flatten the data to one dimension
    Flatten(),
]

# Learn classification in the fully connected layers
classification_layers = [
    # Fully connected layer (128 units)
    Dense(128),
    # Activation function: ReLU
    Activation('relu'),
    # Dropout with a rate of 0.5
    Dropout(0.5),
    # Fully connected layer (5 units)
    Dense(num_classes),
    # Activation function: softmax (because this is a classification problem)
    Activation('softmax')
]

# Instantiate the Sequential class with feature_layers and classification_layers concatenated
model = Sequential(feature_layers + classification_layers)

This time, unlike Part 1 and Part 2, the model is defined without using the .add() method. In Keras, you can also define a model by passing an ordered list of layers to Sequential() when instantiating it.

The reason for writing it this way is that fine-tuning freezes (or updates) the weights of only specific layers during training.

By defining the layers that extract image features and the layers that classify as separate lists, it becomes easy to update the weights of only one of the two groups.
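To get a feel for how much of the model each group covers, the parameter counts can be worked out by hand. The sketch below uses the standard formulas for a convolution layer with bias (kernel * kernel * input channels * filters + filters) and a dense layer with bias (inputs * units + units), assuming the default 'valid' padding for both convolutions. The helper functions are hypothetical and written only for this calculation.

```python
def conv2d_params(kernel, in_channels, out_filters):
    # One kernel*kernel*in_channels weight block per filter, plus one bias per filter.
    return kernel * kernel * in_channels * out_filters + out_filters


def dense_params(in_units, out_units):
    # One weight per input/output pair, plus one bias per output unit.
    return in_units * out_units + out_units


def valid_conv_size(size, kernel):
    # With 'valid' padding the output shrinks by kernel - 1.
    return size - kernel + 1


size = valid_conv_size(28, 3)     # 26 after the first convolution
size = valid_conv_size(size, 3)   # 24 after the second convolution
size = size // 2                  # 12 after 2x2 max pooling
flat = size * size * 32           # 4608 values after Flatten

feature_params = conv2d_params(3, 1, 32) + conv2d_params(3, 32, 32)
classification_params = dense_params(flat, 128) + dense_params(128, 5)

print(feature_params)         # 9568
print(classification_params)  # 590597
```

Under these assumptions, most of the weights sit in the fully connected layers, which are the ones that remain trainable in the second training run.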

Training function

# Create a function that trains and evaluates the model
def train_model(model, train, test, num_classes):

    # Data preprocessing
    # Reshape the data to match the input shape
    x_train = train[0].reshape((train[0].shape[0],) + input_shape)  # e.g. (30596, 28, 28) -> (30596, 28, 28, 1)
    x_test = test[0].reshape((test[0].shape[0],) + input_shape)     # e.g. (5139, 28, 28)  -> (5139, 28, 28, 1)
    # Convert the data type with .astype('float32') (otherwise the in-place division below raises an error)
    x_train = x_train.astype('float32')
    x_test = x_test.astype('float32')
    # Pixel values range from 0 to 255, so divide by 255 to scale them to the range 0 to 1
    x_train /= 255
    x_test /= 255
    print('x_train shape:', x_train.shape)
    print(x_train.shape[0], 'train samples')
    print(x_test.shape[0], 'test samples')

    # Convert the label data to one-hot vectors
    # A one-hot vector looks like this:
    #   label  0 1 2 3 4
    #   0:    [1,0,0,0,0]
    #   3:    [0,0,0,1,0]
    y_train = keras.utils.to_categorical(train[1], num_classes)
    y_test = keras.utils.to_categorical(test[1], num_classes)

    # Set up the training process
    model.compile(loss='categorical_crossentropy',  # Loss function. This is multi-class classification, so categorical_crossentropy
                  optimizer='adadelta',  # The optimization algorithm is Adadelta
                  metrics=['accuracy'])  # Metric to report during training and evaluation

    # Record the training start time
    t = now()
    # Train the model
    model.fit(x_train, y_train,       # Training data and labels
              batch_size=batch_size,  # Batch size (128)
              epochs=epochs,          # Number of epochs (5)
              verbose=1,              # Show a progress bar during training (0 hides it)
              validation_data=(x_test, y_test))  # Validation data (evaluated after each epoch to compute loss and accuracy)
    # Print the time taken for training
    print('Training time: %s' % (now() - t))

    # Evaluation
    # Pass the test data (verbose=0 prints no progress messages)
    score = model.evaluate(x_test, y_test, verbose=0)
    # Print the loss on the test data
    print('Test score:', score[0])
    # Print the accuracy on the test data
    print('Test accuracy:', score[1])

This function is called twice: once to train on the images of 0 to 4, and once more to train on the images of 5 to 9.
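The two preprocessing steps inside train_model can be tried on tiny arrays with NumPy alone. This is a rough sketch, not the Keras implementation: the one_hot helper is a hypothetical stand-in written to resemble what keras.utils.to_categorical produces, and the pixel values are made up.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Hypothetical stand-in for keras.utils.to_categorical (illustration only)."""
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype='float32')[labels]


# Made-up pixel values stored as unsigned 8-bit integers, like raw MNIST images.
pixels = np.array([[0, 128, 255]], dtype='uint8')

# In-place division on a uint8 array fails, because the floating-point
# result cannot be written back into an integer array.
try:
    bad = pixels.copy()
    bad /= 255
    in_place_failed = False
except TypeError:
    in_place_failed = True

# Converting to float32 first makes the in-place division valid.
scaled = pixels.astype('float32')
scaled /= 255

encoded = one_hot(np.array([0, 3]), 5)
print(encoded)
```

The encoded rows correspond to the labels 0 and 3 from the comment in train_model: a single 1 at the index of the class and 0 everywhere else.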

Training

# Train using the function created above
# Train on the images with labels below 5, using the labels 0-4 (classification)
train_model(model,
            (x_train_lt5, y_train_lt5),
            (x_test_lt5, y_test_lt5), num_classes)

# Set trainable=False so that these layers are not trained
# The parameters of feature_layers (the convolutional part) are not updated; only those of classification_layers are
# For the change to take effect, compile() must be called on the model after changing the property
# (train_model calls compile() each time, so this is handled)
for l in feature_layers:
    l.trainable = False

# Train using the function created above
# Train on the images with labels 5 and above, using the shifted labels 0-4 (classification)
train_model(model,
            (x_train_gte5, y_train_gte5),
            (x_test_gte5, y_test_gte5), num_classes)

Now we finally train the model.

  1. Train a model to classify images of the digits 0-4 (this creates the base weights)
  2. Freeze the weights of feature_layers so that they are not updated
  3. Train on images of the digits 5-9 (only the weights of the fully connected layers, i.e. the classification part, are updated)
  4. The result is a model that takes images of the handwritten digits 5 to 9 as input and classifies them into those 5 classes!
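The effect of the steps above can be imitated with a toy example in plain Python. Everything here is hypothetical: ToyLayer and sgd_step are not Keras classes or functions, and each "layer" holds a single number instead of a weight tensor. The sketch only illustrates why flipping a flag on the layers in one list changes what the whole model updates.

```python
class ToyLayer:
    """Hypothetical layer with a single weight and a trainable flag."""

    def __init__(self, name, weight):
        self.name = name
        self.weight = weight
        self.trainable = True


def sgd_step(layers, grads, lr=0.5):
    # Plain gradient descent that skips every layer marked as not trainable.
    for layer, grad in zip(layers, grads):
        if layer.trainable:
            layer.weight -= lr * grad


feature = [ToyLayer('conv1', 1.0), ToyLayer('conv2', 2.0)]
classifier = [ToyLayer('dense1', 3.0), ToyLayer('dense2', 4.0)]

# Concatenating the lists does not copy the layers: the "model" holds
# the very same objects that the feature list refers to.
model_layers = feature + classifier

# Freeze the feature extractor through the separate list.
for layer in feature:
    layer.trainable = False

sgd_step(model_layers, grads=[1.0, 1.0, 1.0, 1.0])
weights = {layer.name: layer.weight for layer in model_layers}
print(weights)  # {'conv1': 1.0, 'conv2': 2.0, 'dense1': 2.5, 'dense2': 3.5}
```

In the real script the same sharing is what lets the loop over feature_layers affect the model, with the extra requirement that compile() is called again afterwards.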

In conclusion

This concludes the three-part series of source code commentary. The explanation is over, but as a bonus, the next article will cover how to save and load a trained model and how to use it. If you only train a model and never save it, all that work is lost.
