[PYTHON] Determining whether it is my dog from a Shiba Inu photo with deep learning (3) Visualization with Grad-CAM

Introduction

--This is an output of my own machine learning and deep learning study records. Following **Determining whether it is my dog from a Shiba Inu photo with deep learning (1)** and **(2) Data augmentation / transfer learning / fine tuning**, this time I use Google Colaboratory to visualize which parts of the image the trained deep learning model pays attention to.
--I describe the parts where I stumbled over various errors in as much detail as possible, so that anyone can easily reproduce the steps.

Target audience of this article / Referenced articles

--Target audience: same as before. For details, see **here**.
--Reference article: **Anomaly detection and visualization of abnormal parts by implementing vgg16 and Grad-CAM with keras**

about me

--Acquired **JDLA Deep Learning for Engineer 2019 #2** in September 2019.
--I worked as a clerical staff member at a public interest corporation until the end of March 2020, and changed careers to data engineer in April 2020. For details, see **here**.

Outline of the previous analysis (2)

--The image files (jpg) to be analyzed were increased to 240 photos in total, 120 photos of my pet dog (a Shiba Inu) and 120 photos of other Shiba Inu, and they were again classified into two classes by deep learning.
--In addition, with transfer learning and fine tuning of an ImageNet model (VGG16) and verification on the test data, the classification accuracy improved from about 75% to about 95%.

Outline of procedure of this time (3)

**Step 1 Data conversion, model construction and training**
**Step 2 Implementation of Grad-CAM**

--The implementation of Grad-CAM was introduced in this article by @T_Tao: **[Anomaly detection and visualization of abnormal parts by implementing vgg16 and Grad-CAM with keras](https://qiita.com/T_Tao/items/0e869e440067518b6b58)**.

--I would like to run the code introduced in that article, continuing to use the Shiba Inu photo data, and visualize the results with Grad-CAM. It is very interesting to see which parts of a dog's picture deep learning looks at as features when distinguishing the two classes.
--In the implementation below, the code of the reference article is basically used as it is. The Google Drive data set up in the previous analysis (2) is also used as it is.

Step 1 Data conversion, model construction and learning

(1) Mount Google Drive

Mount Google Drive so that the folder containing the Shiba Inu images can be read from Colab.

#Google Drive mount
from google.colab import drive
drive.mount('/content/drive')
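If you want to confirm that the mount worked, you can list the top level of the drive (a minimal check, assuming the standard Colab mount point):

#Optional: confirm the mount by listing My Drive
import os
print(os.listdir('/content/drive/My Drive'))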

(2) Import the required library

Import with the following code.

#Library import
from __future__ import print_function
import keras
from keras.applications import VGG16
from keras.models import Sequential, load_model, model_from_json
from keras import models, optimizers, layers
from keras.optimizers import SGD
from keras.layers import Dense, Dropout, Activation, Flatten
from sklearn.model_selection import train_test_split  
from PIL import Image 
from keras.preprocessing import image as images
from keras.preprocessing.image import array_to_img, img_to_array, load_img
from keras import backend as K 
import os
import numpy as np  
import glob  
import pandas as pd
import cv2
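Note: the code in this article is written for standalone Keras on a TensorFlow 1.x style backend. If your Colab runtime uses TensorFlow 2.x, the K.gradients and K.function calls in Step 2 need graph mode, so something like the following may be required (this is an assumption about the runtime, not part of the original code):

# Only needed on TensorFlow 2.x: K.gradients / K.function in Step 2 require graph mode
import tensorflow as tf
tf.compat.v1.disable_eager_execution()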

(3) Data preprocessing

Up to the previous article, I used keras' ImageDataGenerator to convert the image data. This time, instead of using ImageDataGenerator, the following code converts the image files into a tensor.

# Move to the working folder in Google Drive
%cd '/content/drive/My Drive/Colab Notebooks/Self_Study/02_mydog_or_otherdogs/'

num_classes = 2 #Number of classes
folder = ["mydog2", "otherdogs2"] #Folder name where photo data is stored
image_size = 312 #Size of one side of the input images
x = []
y = []

for index, name in enumerate(folder):
    dir = "./original_data/" + name
    files = glob.glob(dir + "/*.jpg")
    for i, file in enumerate(files):    
        image = Image.open(file)                       
        image = image.convert("RGB")
        image = image.resize((image_size, image_size))
        data = np.asarray(image) 
        x.append(data)  
        y.append(index) 
# There are two ways to convert a list into a NumPy array: np.array and np.asarray.
# Use np.array when you want a copy of the data.
# Use np.asarray when you want an array that stays in sync with the original (no copy for an existing array).

x = np.array(x)   
y = np.array(y)

(4) Confirmation of converted data

Let's check how the data is converted and stored in x and y.

#Check the contents of x
display(x)

The image data converted into x looks like the following array.

array([[[[114, 109, 116],
         [116, 111, 118],
         [104,  99, 105],
         ...,
         [ 37,  38,  30],
         [ 37,  38,  30],
         [ 36,  37,  29]],

        [[117, 112, 119],
         [120, 115, 121],
         [110, 105, 111],
         ...,
         [ 37,  38,  30],
         [ 37,  38,  30],
         [ 37,  38,  30]],

        [[118, 113, 120],
         [121, 116, 122],
         [114, 109, 115],
         ...,
         [ 37,  38,  30],
         [ 38,  39,  31],
         [ 38,  39,  31]],
(Omitted)

       ...,

        [[ 60,  56,  53],
         [ 60,  56,  53],
         [ 61,  57,  54],
         ...,
         [105,  97,  84],
         [105,  97,  84],
         [104,  96,  83]]]], dtype=uint8)


Check the contents of y (label).

#Check the contents of y
y

y is generated with two types of labels, "0" and "1".

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
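As an extra sanity check, the array shapes and the label balance can also be confirmed programmatically (a minimal sketch; the expected numbers assume 120 images per class resized to 312 x 312 RGB):

# Sanity check: shapes and label counts
print(x.shape, x.dtype)   # expected: (240, 312, 312, 3) uint8
print(np.bincount(y))     # expected: [120 120] (mydog2 / otherdogs2)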

(5) Divide the converted data into train data and test data

The converted tensor is split by sklearn's train_test_split for training with the model built later.

#Divided into train data and test data
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=1)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255

(6) Convert labels to one-hot representation

#Convert the labels to one-hot representation
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

You should see a result similar to the following:

192 train samples
48 test samples
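For reference, keras.utils.to_categorical turns each integer label into a one-hot vector, so with num_classes = 2 the label 0 becomes [1., 0.] and the label 1 becomes [0., 1.]; a quick check:

# One-hot labels: 0 -> [1., 0.], 1 -> [0., 1.]
print(y_train[:3])    # a few one-hot rows
print(y_train.shape)  # (192, 2)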

(7) Model construction and compilation

Build the model with the following code. This time, SGD (stochastic gradient descent) is specified as the optimizer.

vgg_conv = VGG16(weights='imagenet', include_top=False, input_shape=(image_size, image_size, 3))
last = vgg_conv.output

mod = Flatten()(last)
mod = Dense(1024, activation='relu')(mod)
mod = Dropout(0.5)(mod)
preds = Dense(2, activation='sigmoid')(mod)

model = models.Model(vgg_conv.input, preds)
model.summary()

epochs = 100
batch_size = 48

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
              metrics=['accuracy'])

(8) Model training

Train the model.

history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    validation_data=(x_test, y_test),
                    shuffle=True)

Save the model.

model.save('mydog_or_otherdogs3(Grad-Cam).h5')
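If you want to reuse the saved model later without retraining, it can be reloaded with load_model (already imported above); a minimal sketch using the filename from this article:

# Reload the saved model (optional)
model = load_model('mydog_or_otherdogs3(Grad-Cam).h5')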

(9) Evaluation of the trained model and the accuracy / loss curves

Display the scores with the following code and draw the graphs. The validation result is also high, probably because all of the image files (240 photos) were used.

#score display
scores = model.evaluate(x_test, y_test, verbose=1)
print('Test loss:', scores[0])
print('Test accuracy:', scores[1])

#Accuracy and loss plot
import matplotlib.pyplot as plt 

acc = history.history["acc"]
val_acc = history.history["val_acc"]
loss = history.history["loss"]
val_loss = history.history["val_loss"]
epochs = range(1, len(acc) + 1)

plt.plot(epochs, acc, label = "Training acc" )
plt.plot(epochs, val_acc, label = "Validation acc")
plt.title("Training and Validation accuracy")
plt.legend()
plt.show()

plt.plot(epochs, loss,  label = "Training loss" )
plt.plot(epochs, val_loss, label = "Validation loss")
plt.title("Training and Validation loss")
plt.legend()
plt.show()

The result is as follows.

Test loss: 0.04847167782029327
Test accuracy: 0.9795918367346939

traial07.png
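If you also want to look at individual predictions rather than only the aggregate score, model.predict can be used directly (a minimal sketch; the class indices follow the folder order defined above, 0 = mydog2, 1 = otherdogs2):

# Inspect a few individual predictions (optional)
probs = model.predict(x_test[:5])
print(probs)                     # two scores per image
print(np.argmax(probs, axis=1))  # predicted class index: 0 = mydog2, 1 = otherdogs2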

Step 2 Implementation of Grad-CAM

(1) Implementation of Grad-CAM

Enter the code below. According to @T_Tao, it is based on the code of **Grad-CAM with a model made by myself with keras**.

K.set_learning_phase(1) #set learning phase

def Grad_Cam(input_model, pic_array, layer_name):

    #Preprocessing
    pic = np.expand_dims(pic_array, axis=0)
    pic = pic.astype('float32')
    preprocessed_input = pic / 255.0

    #Prediction class calculation
    predictions = input_model.predict(preprocessed_input)
    class_idx = np.argmax(predictions[0])
    class_output = input_model.output[:, class_idx]

    #Get the gradient
    conv_output = input_model.get_layer(layer_name).output   # Output of the layer specified by layer_name
    grads = K.gradients(class_output, conv_output)[0]  # K.gradients(loss, variables) returns the gradients of loss with respect to variables
    gradient_function = K.function([input_model.input], [conv_output, grads])  # Function that takes input_model.input and returns conv_output and grads

    output, grads_val = gradient_function([preprocessed_input])
    output, grads_val = output[0], grads_val[0]

    # Average the gradients spatially to get the weights, then take the weighted sum of the feature maps
    weights = np.mean(grads_val, axis=(0, 1))
    cam = np.dot(output, weights)

    # Turn the map into a heatmap and overlay it on the image
    cam = cv2.resize(cam, (312, 312), interpolation=cv2.INTER_LINEAR)
    cam = np.maximum(cam, 0)
    cam = cam / cam.max()

    jetcam = cv2.applyColorMap(np.uint8(255 * cam), cv2.COLORMAP_JET)  # Apply a pseudo-color (jet) map to the monochrome heatmap
    jetcam = cv2.cvtColor(jetcam, cv2.COLOR_BGR2RGB)  # Convert from BGR to RGB
    jetcam = (np.float32(jetcam) + pic / 2)   # Combine with the original image
    return jetcam

Let's apply it to some Shiba Inu photos, starting with my own dog.

# Move to the specified folder in Google Drive
%cd '/content/drive/My Drive/Colab Notebooks/'

pic_array = img_to_array(load_img('/content/drive/My Drive/Colab Notebooks/Self_Study/02_mydog_or_otherdogs/original_data/mydog2/mydog1.jpg', target_size=(312, 312)))
pic = pic_array.reshape((1,) + pic_array.shape)
array_to_img(pic_array)

gradcam01b.png

Overlay the heatmap

picture = Grad_Cam(model, pic_array, 'block5_conv3')
picture = picture[0,:,:,]
array_to_img(picture)

gradcam01.png
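If you want to keep the overlaid heatmap as a file instead of only displaying it, the PIL image returned by array_to_img can be saved (a sketch; the output filename is just an example):

# Save the Grad-CAM overlay to a file (the filename is an example)
array_to_img(picture).save('gradcam_mydog1_overlay.png')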

How about that? Visualizing with the Grad-CAM heatmap shows which parts deep learning looks at as features. The redder parts are the parts that contribute most to the score of the predicted class (the parts with a large gradient), and they turn out to be the area from under the eyes to the nose, so I wonder if the model is looking at the parts of the face where individuality shows. What was a little surprising is that the heatmap is also darker between the eyes and the ears (there, of all places?).

I also applied it to two more photos of Mirin and three photos of other Shiba Inu and lined up the results. gradcam02b.pnggradcam02.png gradcam03b.pnggradcam03.png

gradcam11b.pnggradcam11.png gradcam12b.pnggradcam12.png gradcam13b.pnggradcam13.png

In some images the model looks at similar parts (from the eyes to the nose), while in others it looks at completely different parts, which is quite interesting. There does seem to be a general tendency in which parts are treated as features, but it may be a little difficult to explain from these heatmaps alone.

This time, I created heatmaps with Grad-CAM. There seem to be various other methods for visualizing feature regions, such as Grad-CAM++ and Guided Grad-CAM, so I would like to try them from the next article onwards.
