[PYTHON] I tried to implement Grad-CAM with keras and tensorflow

Introduction

This time, I implemented Grad-CAM on my own CNN model. I used Google Colaboratory, but the code should also work locally or in a Jupyter notebook.

The CNN model is AlexNet, implemented and trained with Keras.

The images come from actual research and cannot be published, so please adapt the paths and data to your own environment when following along.

What is Grad-CAM?

If you are wondering what Grad-CAM is in the first place, please see the article "Where in the image is deep learning looking!? Verify the difference between "okonomiyaki" and "pizza" on CNN" for a detailed explanation.

Target audience

- People who know Grad-CAM exists but wonder how to actually implement it
- People struggling with TensorFlow version differences
- People who followed other articles but got stuck on the error below without knowing why

> RuntimeError: tf.gradients is not supported when eager execution is enabled. Use tf.GradientTape instead.

Version

Check the versions before moving on to the implementation. Run the following code in a cell.

import tensorflow as tf
import keras
print('tensorflow version: ', tf.__version__)
print('keras version: ', keras.__version__)

Execution result

tensorflow version:  2.3.0
keras version:  2.4.3
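The error quoted above appears because TensorFlow 2.x runs with eager execution enabled by default, so it can help to decide which gradient API to use from the major version. The following is a minimal sketch; the helper name `uses_eager_by_default` is hypothetical and not part of TensorFlow.

```python
def uses_eager_by_default(version_string):
    """Return True for TensorFlow 2.x and later, where eager execution is on by default."""
    # Hypothetical helper: only the major version number is inspected.
    major = int(version_string.split('.')[0])
    return major >= 2


# With the version printed above, tf.GradientTape is the API to use.
print(uses_eager_by_default('2.3.0'))
# A 1.x version string, where tf.gradients was the usual approach.
print(uses_eager_by_default('1.15.0'))
```

In a notebook, `tf.__version__` could be passed in place of the literal strings.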

Implementation

First, let's take care of the setup. Mount Google Drive with the following command.

from google.colab import drive
drive.mount('/content/drive')

Define the path of the working directory used to load the image and the model. Change the part after My Drive to match your environment.

current_directory_path = '/content/drive/My Drive/Research/AlexNet/'
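Because the directory path ends with a slash, appending file names by string concatenation can easily double or drop a separator. As a sketch of one alternative, `os.path.join` can build the paths; the helper name `path_under` is an assumption made for illustration.

```python
import os

# Same value as current_directory_path in the article
base_dir = '/content/drive/My Drive/Research/AlexNet/'


def path_under(base, *parts):
    """Join path components onto base without doubling separators."""
    # Note: a component that starts with '/' would discard everything before it.
    return os.path.join(base, *parts)


print(path_under(base_dir, 'model.hdf5'))
```

Components should be given without a leading slash, since `os.path.join` treats a component starting with `/` as a new absolute path.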

Import required modules

import numpy as np
import cv2

#For images
from keras.preprocessing.image import array_to_img, img_to_array, load_img
#For model loading
from keras.models import load_model
#For Grad-CAM calculation
from tensorflow.keras import models
import tensorflow as tf

Define the constants. Change these to match your environment as well.

IMAGE_SIZE  = (32, 32)
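One detail worth keeping in mind: this constant is passed both to Keras `load_img(target_size=...)`, which expects (height, width), and to `cv2.resize`, whose `dsize` expects (width, height). For a square size such as (32, 32) the two coincide. For non-square images, a small conversion along these lines may be needed; `to_cv2_dsize` is a hypothetical helper name.

```python
def to_cv2_dsize(target_size):
    """Convert a Keras-style (height, width) pair to OpenCV's (width, height)."""
    height, width = target_size
    return (width, height)


print(to_cv2_dsize((32, 32)))  # square size: order makes no difference
print(to_cv2_dsize((48, 64)))  # height 48, width 64 becomes (64, 48)
```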

Method to calculate Grad-CAM

def grad_cam(input_model, x, layer_name):
    """
    Args: 
        input_model(object):Model object
        x(ndarray):image
        layer_name(string):The name of the convolution layer
    Returns:
        output_image(ndarray):Original image with the Grad-CAM heatmap overlaid
    """

    #Image preprocessing
    #The model expects a batch, so add a batch dimension to the single image
    X = np.expand_dims(x, axis=0)
    preprocessed_input = X.astype('float32') / 255.0    

    grad_model = models.Model([input_model.inputs], [input_model.get_layer(layer_name).output, input_model.output])
    
    with tf.GradientTape() as tape:
        conv_outputs, predictions = grad_model(preprocessed_input)
        class_idx = np.argmax(predictions[0])
        loss = predictions[:, class_idx]

    #Calculate the gradient
    output = conv_outputs[0]
    grads = tape.gradient(loss, conv_outputs)[0]

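    #Keep only positions where both the activation and the gradient are positive
    #(a guided-backpropagation-style gate; plain Grad-CAM averages the raw gradients)
    gate_f = tf.cast(output > 0, 'float32')
    gate_r = tf.cast(grads > 0, 'float32')

    guided_grads = gate_f * gate_r * grads

    #Average the gradients over the spatial axes to get one weight per channel,
    #then take the weighted sum of the layer output
    weights = np.mean(guided_grads, axis=(0, 1))
    cam = np.dot(output, weights)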

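    #Scale the map to the same size as the original image
    #(the third positional argument of cv2.resize is dst, so pass interpolation by keyword)
    cam = cv2.resize(cam, IMAGE_SIZE, interpolation=cv2.INTER_LINEAR)
    #Instead of ReLU
    cam  = np.maximum(cam, 0)
    #Normalize the heatmap to [0, 1]; skip the division for an all-zero map
    heatmap = cam / cam.max() if cam.max() > 0 else cam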

    #Pseudo-color the grayscale heatmap
    jet_cam = cv2.applyColorMap(np.uint8(255.0*heatmap), cv2.COLORMAP_JET)
    #Convert BGR to RGB
    rgb_cam = cv2.cvtColor(jet_cam, cv2.COLOR_BGR2RGB)
    #Combine with the original image (heatmap plus half of the original)
    output_image = (np.float32(rgb_cam) + x / 2)
    
    return output_image
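To make the core arithmetic easier to follow, here is a NumPy-only sketch of the weighting step on toy arrays. It assumes plain Grad-CAM, so it leaves out the positive-gradient gating used in `grad_cam` above; the function name `cam_from_activations` and the toy values are made up for illustration.

```python
import numpy as np


def cam_from_activations(activations, gradients):
    """Compute a normalized class activation map.

    activations, gradients: arrays of shape (H, W, C) for one image.
    """
    # One weight per channel: the gradient averaged over the spatial axes
    weights = gradients.mean(axis=(0, 1))
    # Weighted sum of the channels gives an (H, W) map
    cam = activations @ weights
    # ReLU keeps only regions that raise the class score
    cam = np.maximum(cam, 0)
    peak = cam.max()
    # Normalize to [0, 1] unless the map is all zeros
    return cam / peak if peak > 0 else cam


# Toy input: channel 0 fires at the top-left, channel 1 at the bottom-right
activations = np.zeros((2, 2, 2), dtype=np.float32)
activations[0, 0, 0] = 1.0
activations[1, 1, 1] = 1.0

# Channel 0 raises the class score, channel 1 lowers it
gradients = np.zeros((2, 2, 2), dtype=np.float32)
gradients[..., 0] = 2.0
gradients[..., 1] = -1.0

cam = cam_from_activations(activations, gradients)
print(cam)
```

Only the top-left position stays highlighted, because the channel that fires at the bottom-right has a negative weight and is removed by the ReLU.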

Calculate Grad-CAM

First, load the model and the image. Change each path to match your environment.

#current_directory_path already ends with '/', so no leading slash is needed
model_path = current_directory_path + 'model.hdf5'
image_path = current_directory_path + 'vis_images/1/2014_04_1_3.png'

model = load_model(model_path)
x = img_to_array(load_img(image_path, target_size=IMAGE_SIZE))

Check that the correct image was loaded.

array_to_img(x)

Calculate Grad-CAM. Set target_layer to the name of a convolution layer in your own model.

target_layer = 'conv_filter5'
cam = grad_cam(model, x, target_layer)
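The layer name depends on how the model was built, and `model.summary()` lists the names. A common heuristic is to take the last layer whose output is 4D (batch, height, width, channels). The sketch below assumes each layer exposes `name` and `output.shape`, as layers of a built Keras model do; the helper `last_4d_layer_name` is hypothetical, and the stand-in model with its shapes is invented so the example runs without a trained model.

```python
from types import SimpleNamespace


def last_4d_layer_name(model):
    """Return the name of the last layer whose output has 4 dimensions."""
    for layer in reversed(model.layers):
        if len(layer.output.shape) == 4:
            return layer.name
    raise ValueError('no layer with a 4D output was found')


def fake_layer(name, shape):
    """Build a stand-in object that mimics the attributes used above."""
    return SimpleNamespace(name=name, output=SimpleNamespace(shape=shape))


# Invented layer list for illustration; a real model would be passed instead
fake_model = SimpleNamespace(layers=[
    fake_layer('conv_filter1', (None, 32, 32, 96)),
    fake_layer('conv_filter5', (None, 8, 8, 256)),
    fake_layer('flatten', (None, 16384)),
    fake_layer('dense', (None, 10)),
])

print(last_4d_layer_name(fake_model))
```

Pooling layers also have 4D outputs, so if the model ends its convolutional part with one, the heuristic returns that layer instead; checking the name against the summary is still worthwhile.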

Check the calculated image.

array_to_img(cam)
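The array returned by `grad_cam` adds the full heatmap to half of the original image, so its values can exceed 255; `array_to_img` rescales the array by default, which is why it still displays. If an explicit 0-255 result is preferred, for example for saving with another library, a weighted blend is one option. This is a sketch of an alternative, not the method used above; `blend_for_display` and `alpha` are names chosen for illustration.

```python
import numpy as np


def blend_for_display(heatmap_rgb, image, alpha=0.5):
    """Blend a heatmap and an image (both in the 0-255 range) into a uint8 array."""
    blended = alpha * np.float32(heatmap_rgb) + (1.0 - alpha) * np.float32(image)
    # Clip to the valid range before converting to 8-bit
    return np.clip(blended, 0, 255).astype(np.uint8)


# Toy input: an all-white heatmap over an all-black image
heatmap_rgb = np.full((2, 2, 3), 255, dtype=np.uint8)
image = np.zeros((2, 2, 3), dtype=np.float32)

out = blend_for_display(heatmap_rgb, image)
print(out.dtype, out.shape)
```

A larger `alpha` emphasizes the heatmap, and a smaller one keeps more of the original image visible.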

In conclusion

This time, I implemented Grad-CAM on Google Colaboratory. I hope it helps people who have struggled with TensorFlow version issues.

If you find any mistakes, please let me know!

References

- Where in the image is deep learning looking!? Verify the difference between "okonomiyaki" and "pizza" on CNN
- Grad CAM implementation with Tensorflow 2