[PYTHON] Deep learning to judge whether a Shiba Inu photo is of my dog (4) Visualization with Grad-CAM and Guided Grad-CAM

Introduction

-- This is an output of my own machine learning and deep learning study records.
-- Following ** Deep learning to judge whether a Shiba Inu photo is of my dog (1) **, ** (2) Data augmentation / transfer learning / fine tuning **, and ** (3) Visualization with Grad-CAM **, this time we visualize the basis of the deep learning classification results with yet another approach, again using Google Colaboratory.
-- As before, I describe the parts where I stumbled over various errors in as much detail as possible, so that anyone can easily reproduce the steps.

Target audience of this article / Referenced articles

-- Target audience: same as the previous articles. See ** here ** for details.
-- Reference article: ** I will explain the source code of Grad-CAM and Guided Grad-CAM in the most detail in Japan (Keras implementation) **

About me

-- Acquired ** JDLA Deep Learning for Engineer 2019 #2 ** in September 2019.
-- I worked as a clerk at a public interest corporation until the end of March 2020, and change careers to data engineer in April 2020.

Outline of the previous analysis (3)

-- We implemented Grad-CAM and created heat maps of the features that form the basis for classifying the Shiba Inu photos.

Outline of this article's procedure (4)

** Step 1 Preparation **
** Step 2 Register the functions required to implement Grad-CAM and Guided Grad-CAM **
** Step 3 Implementation of the main processing of Grad-CAM **
** Step 4 Implementation of the main processing of Guided Grad-CAM **

-- @kinziro's article ** I will explain the source code of Grad-CAM and Guided Grad-CAM in the most detail in Japan (Keras implementation) ** introduced an implementation of Guided Grad-CAM. The code is published on ** github.com **.
-- I implement the code introduced in that article, with the priority on getting it to work first. I continue to use the Shiba Inu photo data, visualizing it with both Grad-CAM and Guided Grad-CAM; by comparing the results of the two, I want to deepen the observation of how each method expresses the features.
-- In the implementation below, the code of the reference article is basically written as-is. The Google Drive data set up in the previous analysis (2) is also used as-is.

Step 1 Preparation

(1) Mount Google Drive

Mount Google Drive so that the Shiba Inu images can be read into Colab from the folder that contains them.

#Google Drive mount
from google.colab import drive
drive.mount('/content/drive')

(2) Import the required library

Import with the following code.

# Library import
from __future__ import print_function
import keras
from keras.applications import VGG16
# preprocess_input and decode_predictions are needed by the code below
from keras.applications.vgg16 import preprocess_input, decode_predictions
from keras.models import Sequential, load_model, model_from_json
from keras import models, optimizers, layers
from keras.optimizers import SGD
# Lambda is used in grad_cam()
from keras.layers import Dense, Dropout, Activation, Flatten, Lambda
from sklearn.model_selection import train_test_split
from PIL import Image
from keras.preprocessing import image
from keras.preprocessing.image import array_to_img, img_to_array, load_img
from keras import backend as K
# tf and ops are used by the Guided Back Propagation functions below
import tensorflow as tf
from tensorflow.python.framework import ops
import os
import numpy as np
import glob
import pandas as pd
import cv2

(3) Keras version check

Let's check the version of the imported Keras. (As of January 19, 2020, when this article was written, it is the following version.)

print(keras.__version__)

2.2.5

There is a point to note here. It was pointed out in a ** comment ** on the original article: unless the Keras version is ** 2.2.4 or lower **, an error occurs when executing the source code. When I tried it, I indeed got the following error message on the last line of the series of code executions. It seems you do have to downgrade Keras before running it.

AttributeError: module 'keras.backend' has no attribute 'image_dim_ordering'
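
For reference, this incompatibility comes from the call to `K.image_dim_ordering()` in `deprocess_image` below: newer Keras versions replaced it with `K.image_data_format()`. A minimal compatibility check (my own sketch, not part of the original article) would look like this:

# Sketch: check which backend API is available before calling it.
# K.image_dim_ordering() exists up to Keras 2.2.4; later versions
# provide K.image_data_format() instead.
from keras import backend as K

if hasattr(K, 'image_dim_ordering'):
    print(K.image_dim_ordering())    # 'tf' or 'th' (Keras <= 2.2.4)
else:
    print(K.image_data_format())     # 'channels_last' or 'channels_first'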

(4) Lower the version of keras

Run the following code.

# Change keras to the specified version (2.2.4)
# The runtime needs to be restarted after this cell runs
!pip install keras==2.2.4

When executed, the following screen appears and the library is changed to the specified version. However, as the brown warning text says, a runtime restart is required for this change to take effect. (Screenshot: 20200119-01.png)

To restart the runtime, click "Restart runtime" from "Runtime" on the menu bar. (Screenshot: 20200119-02.png)

(5) Redo (1) to (3) of Step 1

After restarting the runtime, run all the previous steps of Step 1 again and confirm that the Keras version is 2.2.4.

print(keras.__version__)

2.2.4
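
If you are worried about forgetting the restart, a one-line guard (my own addition, not in the original article) fails fast when the old Keras is still loaded:

# Optional guard: stop immediately if the runtime was not restarted
import keras
assert keras.__version__ == '2.2.4', 'Restart the runtime: Keras %s is still active' % keras.__version__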

Step 2 Register the functions required to implement Grad-CAM and Guided Grad-CAM

(1) Execute the function definition code published in the original article

Run all the code below.

def target_category_loss(x, category_index, nb_classes):
    return tf.multiply(x, K.one_hot([category_index], nb_classes))

def target_category_loss_output_shape(input_shape):
    return input_shape

def normalize(x):
    # utility function to normalize a tensor by its L2 norm
    return x / (K.sqrt(K.mean(K.square(x))) + 1e-5)

def load_image(path):
    #img_path = sys.argv[1]
    img_path = path
    # Read the image file specified by the argument.
    # It is resized to VGG16's default size of 224x224.
    img = image.load_img(img_path, target_size=(224, 224))
    # Convert the loaded PIL-format image to an array
    x = image.img_to_array(img)
    # Convert the 3D tensor (rows, cols, channels) to
    # a 4D tensor (samples, rows, cols, channels).
    # Since there is only one input image, samples=1 is fine.
    x = np.expand_dims(x, axis=0)
    x = preprocess_input(x)
    return x


def register_gradient():
    # Register GuidedBackProp if it is not registered yet
    if "GuidedBackProp" not in ops._gradient_registry._registry:
        # Decorator that registers a custom gradient;
        # here the _GuidedBackProp function is registered under the name "GuidedBackProp"
        @ops.RegisterGradient("GuidedBackProp")
        def _GuidedBackProp(op, grad):
            '''Backpropagate only through cells whose forward-propagation input and backpropagated gradient are both positive; all other cells are set to 0'''
            dtype = op.inputs[0].dtype
            # grad : the backpropagated gradient
            # tf.cast(grad > 0., dtype) : matrix with 1 where grad is positive, 0 elsewhere
            # tf.cast(op.inputs[0] > 0., dtype) : matrix with 1 where the input is positive, 0 elsewhere
            return grad * tf.cast(grad > 0., dtype) * \
                tf.cast(op.inputs[0] > 0., dtype)


def compile_saliency_function(model, activation_layer='block5_conv3'):
    '''Create a function that computes the gradient of the input with respect to the channel-wise maximum of the specified layer'''
    # Model input
    input_img = model.input
    # Keep the layers after the input layer as a dictionary of layer names and instances
    layer_dict = dict([(layer.name, layer) for layer in model.layers[1:]])
    # Get the output of the layer whose name is given by the argument; shape=(?, 14, 14, 512)
    layer_output = layer_dict[activation_layer].output
    # Take the maximum value along the channel direction; shape=(?, 14, 14)
    max_output = K.max(layer_output, axis=3)
    # Function that computes the gradient of the input with respect to the channel-wise maximum of the specified layer
    saliency = K.gradients(K.sum(max_output), input_img)[0]
    return K.function([input_img, K.learning_phase()], [saliency])


def modify_backprop(model, name):
    '''Replace the gradient of the ReLU activations with the gradient registered under "name"'''

    # Inside the with block, the gradient of ReLU is overridden by "name"
    g = tf.get_default_graph()
    with g.gradient_override_map({'Relu': name}):

        # ▽▽▽▽▽ Question 4: Is it necessary to replace the relu of the argument model even though a new model is returned? ▽▽▽▽▽

        # Extract only the layers that have an activation
        # get layers that have an activation
        layer_dict = [layer for layer in model.layers[1:]
                      if hasattr(layer, 'activation')]

        # Replace the Keras ReLU with the TensorFlow ReLU
        # replace relu activation
        for layer in layer_dict:
            if layer.activation == keras.activations.relu:
                layer.activation = tf.nn.relu

        # △△△△△△△△△△△△△△△△△△△△△△△△△△△△△△△△△△△△△△△△△△△△△△△△△△△

        # Instantiate a new model
        # Modify here when using your own model
        # re-instanciate a new model
        new_model = VGG16(weights='imagenet')
    return new_model

def deprocess_image(x):
    '''
    Same normalization as in:
    https://github.com/fchollet/keras/blob/master/examples/conv_filter_visualization.py

    '''
    if np.ndim(x) > 3:
        x = np.squeeze(x)
    # normalize tensor: center on 0., ensure std is 0.1
    x -= x.mean()
    x /= (x.std() + 1e-5)
    x *= 0.1

    # clip to [0, 1]
    x += 0.5
    x = np.clip(x, 0, 1)

    # convert to RGB array
    x *= 255
    if K.image_dim_ordering() == 'th':
        x = x.transpose((1, 2, 0))
    x = np.clip(x, 0, 255).astype('uint8')
    return x

def grad_cam(input_model, image, category_index, layer_name):
    '''
    Parameters
    ----------
    input_model : model
        Keras model to evaluate
    image : ndarray
        Input image, shape (samples, rows, cols, channels)
    category_index : int
        Classification class of the input image
    layer_name : str
        Layer name of the activation layer after the last conv layer.
        If the activation is specified in the last conv layer itself, the layer name of that conv layer.
        If the activation is not specified in the conv layer, such as when using batch normalization,
        the layer name of the activation layer that follows it.

    Returns
    ----------
    cam : ndarray
        Grad-CAM image
    heatmap : ndarray
        Heat map image
    '''
    # Number of classification classes
    nb_classes = 1000

    # ----- 1. Compute the prediction class of the input image -----

    # The input category_index is the predicted class

    # ----- 2. Compute the loss for the predicted class -----

    # Define processing that sets all values of the input x to 0 except the index specified by category_index
    target_layer = lambda x: target_category_loss(x, category_index, nb_classes)

    # Add a target_layer Lambda layer after the output layer of the argument input_model.
    # When this model predicts, all values other than the predicted class become 0.
    x = input_model.layers[-1].output
    x = Lambda(target_layer, output_shape=target_category_loss_output_shape)(x)
    model = keras.models.Model(input_model.layers[0].input, x)

    # Since the values other than the predicted class are 0, taking the sum extracts only the value of the predicted class
    loss = K.sum(model.layers[-1].output)
    # Get the output of the layer named layer_name (the last conv layer)
    conv_output = [l for l in model.layers if l.name == layer_name][0].output

    # ----- 3. Compute the backpropagated gradient from the predicted-class loss to the last conv layer -----

    # Define a function that computes the gradient from the value of the predicted class to the last conv layer.
    # For the defined function:
    # input: [image to judge, shape=(1, 224, 224, 3)]
    # output: [output of the last conv layer, shape=(1, 14, 14, 512); gradient from the predicted-class value to the last conv layer, shape=(1, 14, 14, 512)]
    grads = normalize(K.gradients(loss, conv_output)[0])
    gradient_function = K.function([model.layers[0].input], [conv_output, grads])

    # Compute with the defined gradient function and reshape the data.
    # After reshaping:
    # output.shape=(14, 14, 512), grads_val.shape=(14, 14, 512)
    output, grads_val = gradient_function([image])
    output, grads_val = output[0, :], grads_val[0, :, :, :]

    # ----- 4. Average the gradient per channel of the last conv layer to get the importance (weight) of each channel -----

    # weights.shape=(512, )
    # cam.shape=(14, 14)
    # * Question 1: Shouldn't cam be initialized with zeros?
    weights = np.mean(grads_val, axis=(0, 1))
    cam = np.ones(output.shape[0:2], dtype=np.float32)
    #cam = np.zeros(output.shape[0:2], dtype=np.float32)    # Use this for your own model

    # ----- 5. Weight the forward-propagation output of the last conv layer per channel, sum it, and pass it through ReLU -----

    # Weight the forward-propagation output of the last conv layer per channel and sum it
    for i, w in enumerate(weights):
        cam += w * output[:, :, i]

    # Resize to the size of the input image: (14, 14) -> (224, 224)
    cam = cv2.resize(cam, (224, 224))
    # Replace negative values with 0; the processing is the same as ReLU
    cam = np.maximum(cam, 0)
    # Normalize the values to the range 0 to 1
    # * Question 2: Isn't (cam - np.min(cam)) / (np.max(cam) - np.min(cam)) necessary?
    heatmap = cam / np.max(cam)
    #heatmap = (cam - np.min(cam))/(np.max(cam) - np.min(cam))    # Use this for your own model

    # ----- 6. Overlay the input image and the heatmap -----

    # Normalize the value of the input image to 0-255; image.shape=(1, 224, 224, 3) -> (224, 224, 3)
    # Restore the preprocessed image back to BGR [0..255]
    image = image[0, :]
    image -= np.min(image)
    # * Question 3: Isn't np.uint8(image / np.max(image)) necessary?
    image = np.minimum(image, 255)

    # Color-map the heatmap values to 0-255 (3 channels)
    cam = cv2.applyColorMap(np.uint8(255 * heatmap), cv2.COLORMAP_JET)
    # Add the input image and the heatmap
    cam = np.float32(cam) + np.float32(image)
    # Normalize the values to 0-255
    cam = 255 * cam / np.max(cam)
    return np.uint8(cam), heatmap
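
As a side note on the GuidedBackProp gradient registered above: the rule `grad * (grad > 0) * (input > 0)` lets the gradient through only where both the forward-propagation input and the incoming gradient are positive. A tiny NumPy sketch of my own (not in the original article) illustrates this:

# Numeric illustration of the GuidedBackProp rule (my own sketch)
import numpy as np

inputs = np.array([-1.0, 2.0, 3.0, -4.0])   # forward-propagation inputs
grad   = np.array([ 5.0, -6.0, 7.0,  8.0])  # backpropagated gradients
guided = grad * (grad > 0) * (inputs > 0)
print(guided)  # only index 2 survives: positive input AND positive gradient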

Step 3 Implementation of the main processing of Grad-CAM

(1) Specifying the input image

Enter the code below. The image can be any image. (This example specifies mydog07.jpg.)

# Move to the working folder under /content/drive/My Drive/Colab Notebooks
%cd '/content/drive/My Drive/Colab Notebooks/Self_Study/02_mydog_or_otherdogs/'

# ① Load the input image
# Change here to use a different input image
# preprocessed_input = load_image(sys.argv[1])
preprocessed_input = load_image("./use_data/train/mydog/mydog07.jpg")

In this example, I specified the following image. (Image: mydog07.jpg)

(2) Loading VGG16 model

-- Load the VGG16 model. This time we use the model trained on ImageNet as-is. (I would like to try applying the fine-tuned weights that discriminate my dog from the other dogs on another occasion.)

#② Load the model
#Change here if you use your own model
model = VGG16(weights='imagenet')
model.summary()

The model should look like this:

Model: "vgg16"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 224, 224, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0         
_________________________________________________________________
flatten (Flatten)            (None, 25088)             0         
_________________________________________________________________
fc1 (Dense)                  (None, 4096)              102764544 
_________________________________________________________________
fc2 (Dense)                  (None, 4096)              16781312  
_________________________________________________________________
predictions (Dense)          (None, 1000)              4097000   
=================================================================
Total params: 138,357,544
Trainable params: 138,357,544
Non-trainable params: 0
_________________________________________________________________
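
Incidentally, regarding the comment "Change here if you use your own model": if you wanted to plug in the fine-tuned model from the previous article (2) instead of the stock ImageNet VGG16, the swap would look roughly like the sketch below. The .h5 filename is a hypothetical assumption for illustration; note that `nb_classes` in `grad_cam` (currently 1000) and the `decode_predictions` call would also need adjusting for a 2-class model.

# Hypothetical sketch: load your own fine-tuned model instead of ImageNet VGG16
# (the filename is an assumption, not from the original article)
from keras.models import load_model

my_model = load_model('mydog_or_otherdogs_finetuned.h5')  # hypothetical path
my_model.summary()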

(3) Run the remaining steps.

Run the following code.

# ③ Compute the prediction probabilities (predictions) and the predicted class (predicted_class) of the input image
# When using a model other than VGG16, comment out the 3 lines starting with top_1 =
predictions = model.predict(preprocessed_input)
top_1 = decode_predictions(predictions)[0][0]
print('Predicted class:')
print('%s (%s) with probability %.2f' % (top_1[1], top_1[0], top_1[2]))

predicted_class = np.argmax(predictions)

# ④ Grad-CAM computation
# For your own model, change the argument "block5_conv3" to the name of the last conv layer of your model
cam, heatmap = grad_cam(model, preprocessed_input, predicted_class, "block5_conv3")

# ⑤ Save the image
cv2.imwrite("gradcam.jpg", cam)

A Grad-CAM heatmap image is generated in the ** 02_mydog_or_otherdogs ** folder. (Image: gradcam2.jpg)
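
Since Colab cannot open a native image window, it can be convenient to display the saved file inline; a small sketch of my own (not part of the reference code):

# Optional: display the saved Grad-CAM image inline in Colab
# (cv2 loads images as BGR, so convert to RGB for matplotlib)
import cv2
import matplotlib.pyplot as plt

img = cv2.imread("gradcam.jpg")
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.axis('off')
plt.show()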

Step 4 Implementation of the main processing of Guided Grad-CAM

(1) Run the main processing code

Execute all the following code.

# ① Register the gradient for Guided Back Propagation
register_gradient()
# ② Change the gradient computation of ReLU to that of Guided Back Propagation
guided_model = modify_backprop(model, 'GuidedBackProp')
# ③ Define the function for the Guided Back Propagation computation
# When using your own model, additionally pass the layer name of the last conv layer as an argument
saliency_fn = compile_saliency_function(guided_model)
# ④ Compute the Guided Back Propagation
saliency = saliency_fn([preprocessed_input, 0])
# ⑤ Compute the Guided Grad-CAM
gradcam = saliency[0] * heatmap[..., np.newaxis]
# ⑥ Save the image
cv2.imwrite("guided_gradcam.jpg", deprocess_image(gradcam))

A Guided Grad-CAM image is generated in the ** 02_mydog_or_otherdogs ** folder. (Image: guided_gradcam2.jpg)

The result looks like this. Indeed, I feel that it captures the features in a form that is easier for humans to understand than the heat map is.

Below I post some more results processed for mydog and otherdogs.

mydog: (Images: gradcam3.jpg / guided_gradcam3.jpg, gradcam4.jpg / guided_gradcam4.jpg, gradcam5.jpg / guided_gradcam5.jpg, gradcam6.jpg / guided_gradcam6.jpg)
otherdogs: (Images: gradcam11.jpg / guided_gradcam11.jpg, gradcam12.jpg / guided_gradcam12.jpg, gradcam13.jpg / guided_gradcam13.jpg, gradcam14.jpg / guided_gradcam14.jpg)
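
To compare the two visualizations directly, the saved files can also be displayed side by side; this is a sketch of my own, assuming both images were written by the steps above:

# Optional: show a Grad-CAM / Guided Grad-CAM pair side by side
import cv2
import matplotlib.pyplot as plt

fig, axes = plt.subplots(1, 2, figsize=(8, 4))
for ax, fname, title in zip(axes, ["gradcam.jpg", "guided_gradcam.jpg"],
                            ["Grad-CAM", "Guided Grad-CAM"]):
    ax.imshow(cv2.cvtColor(cv2.imread(fname), cv2.COLOR_BGR2RGB))
    ax.set_title(title)
    ax.axis('off')
plt.show()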

This time we created images that capture the features with Grad-CAM and Guided Grad-CAM. Since the two methods express the features in the resulting images quite differently, when explaining the features captured by deep learning it seems best to use as many perspectives as possible and aim for a comprehensive understanding. I would like to keep producing output on various methods.
