This time, I implemented Grad-CAM on my own CNN model. I'm using Google Colaboratory, but it should also work locally or in Jupyter Notebook.
The CNN model is AlexNet, implemented and trained with Keras.
The image comes from actual research and cannot be published, so please substitute your own images and paths when following along.
If you are wondering what Grad-CAM is in the first place, please take a close look at the article "Where in the image is deep learning looking!? Verify the difference between "okonomiyaki" and "pizza" on CNN".
This article is for:
- People who know Grad-CAM exists but don't know how to actually implement it
- People struggling with TensorFlow versions
- People who followed other articles but got stuck on the error below (I don't get it ...)
> RuntimeError: tf.gradients is not supported when eager execution is enabled. Use tf.GradientTape instead.
Check the version before moving on to implementation
Put the following code in a cell and check the versions.
import tensorflow as tf
import keras
print('tensorflow version: ', tf.__version__)
print('keras version: ', keras.__version__)
tensorflow version: 2.3.0
keras version: 2.4.3
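The error quoted earlier appears because TensorFlow 2.x runs with eager execution enabled by default. As a small sketch, the helpers below read a version string and report whether the tf.GradientTape route is the one to take. `tf_major_version` and `needs_gradient_tape` are hypothetical names for illustration, not part of TensorFlow.

```python
def tf_major_version(version_string):
    """Return the major version number from a string such as '2.3.0'."""
    return int(version_string.split('.')[0])


def needs_gradient_tape(version_string):
    """TensorFlow 2.x is eager by default, so tf.gradients raises RuntimeError there."""
    return tf_major_version(version_string) >= 2


# Sample version strings; in a notebook you would pass tf.__version__ instead.
for version in ['1.15.0', '2.3.0']:
    print(version, '-> use tf.GradientTape:', needs_gradient_tape(version))
```

With the versions printed above (2.3.0), this check returns True, which is why the implementation in this article uses tf.GradientTape.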
Let's do what we need first
Mount Google Drive with the command below.
from google.colab import drive
drive.mount('/content/drive')
Define the path of the current directory used to load the image and the model. Please change the part after My Drive to match your environment.
current_directory_path = '/content/drive/My Drive/Research/AlexNet/'
Import required modules
import numpy as np
import cv2
#For images
from keras.preprocessing.image import array_to_img, img_to_array, load_img
#For model loading
from keras.models import load_model
#For Grad-CAM calculation
from tensorflow.keras import models
import tensorflow as tf
Definition of constants
Please change this as well according to your environment.
IMAGE_SIZE = (32, 32)
Method to calculate Grad-CAM
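For reference, the standard Grad-CAM definition can be written as below. Here $y^c$ is the model output for class $c$, $A^k$ is the $k$-th feature map of the chosen convolution layer, and $Z$ is the number of spatial positions in that feature map. This is the general formulation, stated as background. The function in this article additionally gates the gradients (it keeps only positive gradients at positions with positive activations) before averaging, so its weights are not identical to $\alpha_k^c$.

```latex
\alpha_k^c = \frac{1}{Z} \sum_{i} \sum_{j} \frac{\partial y^c}{\partial A^k_{ij}},
\qquad
L^c_{\text{Grad-CAM}} = \mathrm{ReLU}\Big( \sum_{k} \alpha_k^c A^k \Big)
```

In the code, `weights` plays the role of $\alpha_k^c$ and `cam` plays the role of $L^c_{\text{Grad-CAM}}$.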
def grad_cam(input_model, x, layer_name):
    """
    Args:
        input_model(object): Model object
        x(ndarray): Image
        layer_name(string): Name of the convolution layer
    Returns:
        output_image(ndarray): Original image overlaid with the colored heatmap
    """
    # Image preprocessing
    # Only one image is loaded, so a batch dimension must be added
    # or the model cannot predict
    X = np.expand_dims(x, axis=0)
    preprocessed_input = X.astype('float32') / 255.0

    # Model that returns both the target layer's output and the predictions
    grad_model = models.Model([input_model.inputs], [input_model.get_layer(layer_name).output, input_model.output])

    with tf.GradientTape() as tape:
        conv_outputs, predictions = grad_model(preprocessed_input)
        class_idx = np.argmax(predictions[0])
        loss = predictions[:, class_idx]

    # Calculate the gradient
    output = conv_outputs[0]
    grads = tape.gradient(loss, conv_outputs)[0]

    # Keep only positive gradients at positions with positive activations
    gate_f = tf.cast(output > 0, 'float32')
    gate_r = tf.cast(grads > 0, 'float32')
    guided_grads = gate_f * gate_r * grads

    # Average the weights and multiply by the output of the layer
    weights = np.mean(guided_grads, axis=(0, 1))
    cam = np.dot(output, weights)

    # Scale the map to the same size as the original image
    # (interpolation must be passed by keyword; the third positional argument is dst)
    cam = cv2.resize(cam, IMAGE_SIZE, interpolation=cv2.INTER_LINEAR)
    # Instead of ReLU
    cam = np.maximum(cam, 0)
    # Calculate heatmap (assumes cam.max() > 0)
    heatmap = cam / cam.max()

    # Apply a pseudo-color map to the monochrome heatmap
    jet_cam = cv2.applyColorMap(np.uint8(255.0*heatmap), cv2.COLORMAP_JET)
    # Convert to RGB
    rgb_cam = cv2.cvtColor(jet_cam, cv2.COLOR_BGR2RGB)
    # Combine with the original image (heatmap plus the image at half intensity)
    output_image = (np.float32(rgb_cam) + x / 2)

    return output_image
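The weighting step inside grad_cam can be tried in isolation with NumPy alone. The sketch below assumes the layer activations and gradients for one image are already available as (H, W, C) arrays, and uses random numbers in their place. `cam_from_activations` is a hypothetical helper, it skips the resize step, and it guards against an all-zero map, which the function above does not.

```python
import numpy as np


def cam_from_activations(activations, grads):
    """Build a normalized heatmap from (H, W, C) activations and gradients."""
    # Same gating as in grad_cam: keep positive gradients where activations are positive
    guided = (activations > 0) * (grads > 0) * grads
    # One weight per channel: average the gated gradients over the spatial axes
    weights = guided.mean(axis=(0, 1))      # shape (C,)
    # Weighted sum of the feature maps
    cam = activations @ weights             # shape (H, W)
    # ReLU, then scale to 0-1 unless the map is entirely zero
    cam = np.maximum(cam, 0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam


# Random stand-ins for a real layer output and its gradients (assumption)
rng = np.random.default_rng(0)
acts = rng.standard_normal((4, 4, 3)).astype('float32')
grads = rng.standard_normal((4, 4, 3)).astype('float32')

heatmap = cam_from_activations(acts, grads)
print(heatmap.shape, float(heatmap.min()), float(heatmap.max()))
```

Working on small arrays like this makes it easier to check shapes before wiring the step to a real model.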
First, load the model and image. Please match each path to your environment.
model_path = current_directory_path + '/model.hdf5'
image_path = current_directory_path + '/vis_images/1/2014_04_1_3.png'
model = load_model(model_path)
x = img_to_array(load_img(image_path, target_size=IMAGE_SIZE))
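Joining paths by string concatenation makes it easy to end up with a missing or doubled slash. As an optional alternative, here is a minimal sketch using the standard-library pathlib module. The directory and file names are the ones used in this article, so replace them with your own.

```python
from pathlib import PurePosixPath

# Base directory from this article; the trailing slash is normalized away
base_dir = PurePosixPath('/content/drive/My Drive/Research/AlexNet/')

# The / operator inserts exactly one separator between the parts
model_path = str(base_dir / 'model.hdf5')
image_path = str(base_dir / 'vis_images' / '1' / '2014_04_1_3.png')

print(model_path)
print(image_path)
```

The resulting strings can be passed to load_model and load_img in the same way as before.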
Check that the intended image was loaded.
array_to_img(x)
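Besides looking at the image, it can help to check the array itself, since grad_cam adds a batch axis and divides by 255 internally. The sketch below reproduces those two steps on a dummy array. It assumes the loaded image is a float32 array of shape (height, width, channels) with values from 0 to 255, and `x_dummy` is a stand-in for the real `x`.

```python
import numpy as np

# Dummy stand-in for the loaded image array (assumption: 32x32 RGB, values 0-255)
x_dummy = np.random.default_rng(0).integers(0, 256, size=(32, 32, 3)).astype('float32')

# Same preprocessing as at the top of grad_cam
batch = np.expand_dims(x_dummy, axis=0)       # add the batch axis
scaled = batch.astype('float32') / 255.0      # scale to the 0-1 range

print(x_dummy.shape, '->', scaled.shape)
print(float(scaled.min()), float(scaled.max()))
```

If the shape printed for your own image does not match the model's input shape, adjust IMAGE_SIZE before calling grad_cam.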
Calculate Grad-CAM
target_layer = 'conv_filter5'
cam = grad_cam(model, x, target_layer)
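The layer name 'conv_filter5' is specific to my model, so you will need the name of a layer in your own. One way to find a candidate is to walk the layers backwards and take the last one whose output still has spatial dimensions. The sketch below assumes each layer exposes `name` and `output.shape` the way Keras layers do, and it uses stand-in objects with made-up shapes so that it runs without a trained model. `find_last_spatial_layer_name` is a hypothetical helper.

```python
from types import SimpleNamespace


def find_last_spatial_layer_name(layers):
    """Return the name of the last layer with a 4-D output (batch, H, W, C).

    Note: this may be a pooling layer rather than a convolution layer.
    """
    for layer in reversed(list(layers)):
        if len(layer.output.shape) == 4:
            return layer.name
    raise ValueError('no layer with a 4-D output was found')


# Stand-in layers with made-up shapes; with a real model, pass model.layers
fake_layers = [
    SimpleNamespace(name='conv_filter1', output=SimpleNamespace(shape=(None, 32, 32, 96))),
    SimpleNamespace(name='conv_filter5', output=SimpleNamespace(shape=(None, 8, 8, 256))),
    SimpleNamespace(name='flatten', output=SimpleNamespace(shape=(None, 16384))),
    SimpleNamespace(name='dense', output=SimpleNamespace(shape=(None, 10))),
]

print(find_last_spatial_layer_name(fake_layers))
```

Printing every layer name and picking the layer by hand works just as well. This helper only saves the lookup.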
Check the calculated image.
array_to_img(cam)
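grad_cam adds the heatmap to half of the original image and leaves the final scaling to array_to_img, which, as far as I know, rescales values to the 0-255 range by default. If you would rather control the mix yourself, an explicit alpha blend is one option. The sketch below is an alternative to the overlay step, not what the function above does, and `blend_overlay` is a hypothetical helper.

```python
import numpy as np


def blend_overlay(image_rgb, heatmap_rgb, alpha=0.5):
    """Blend two (H, W, 3) arrays in the 0-255 range and return a uint8 image."""
    image = np.asarray(image_rgb, dtype=np.float32)
    heat = np.asarray(heatmap_rgb, dtype=np.float32)
    # alpha is the share of the heatmap; (1 - alpha) is the share of the image
    blended = (1.0 - alpha) * image + alpha * heat
    return np.clip(blended, 0, 255).astype(np.uint8)


# Tiny stand-in arrays: a light gray image and a pure red heatmap
image = np.full((2, 2, 3), 200, dtype=np.uint8)
heat = np.zeros((2, 2, 3), dtype=np.uint8)
heat[..., 0] = 255

overlay = blend_overlay(image, heat, alpha=0.5)
print(overlay.dtype, overlay.shape, overlay[0, 0])
```

A smaller alpha keeps more of the original image visible, which can help on small images such as the 32x32 ones used here.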
This time, I implemented Grad-CAM with Google Colaboratory. I hope it helps people who have struggled with TensorFlow version issues.
If I got something wrong here, please let me know!
Where in the image is deep learning looking!? Verify the difference between "okonomiyaki" and "pizza" on CNN
Grad CAM implementation with Tensorflow 2