[PYTHON] Can AI distinguish between Carlos Ghosn and Mr. Bean (face recognition using face landmarks)?

Introduction

The news that Carlos Ghosn, the former chairman of Nissan Motor Co., Ltd., broke his bail conditions and traveled to Lebanon is a hot topic. I have thought for a long time that he looks a lot like Mr. Bean.

Left: Carlos Ghosn Right: [Mr. Bean (Rowan Atkinson)](https://www.google.com/search?q=rowan+atkinson&biw=1386&bih=710&sxsrf=ACYBGNTsdddnLF7jgPr9RTePeEMGgREpyA:1577813660092&source=lnms&tbm=isch&sa=lnms&tbm=isch&saX

Since I had time off over the year-end and New Year holidays, I built a small classifier using a neural network. A Google search turns up plenty of images of both men, so a CNN would have worked, but this time I wanted to try a different approach: classifying by the positions of facial landmarks.

Please note that the coding is rough.

I'm posting the source code and the image data used on GitHub. face_identification

Environment

Process flow

  1. Face detection using OpenCV cascade
  2. Estimate 68 landmarks using dlib based on the detected facial area
  3. Enter the position of 68 points into the classifier
  4. Carlos if the output is 0, Mr. Bean if the output is 1.
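
Step 4 of the flow above can be sketched as follows. This is only an illustration: `LABELS` and `decode_prediction` are hypothetical names that do not appear in the original scripts, and the input is assumed to be the two-element softmax output of the classifier.

```python
# Hypothetical helper (not part of the original scripts): map a two-element
# softmax output to a person's name.
LABELS = ("Carlos", "Rowan")  # index 0 = Carlos Ghosn, index 1 = Mr. Bean


def decode_prediction(probabilities):
    """Return (label, confidence) for the most probable class."""
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return LABELS[best_index], probabilities[best_index]


label, confidence = decode_prediction([0.12, 0.88])
print(label, confidence)
```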

Creating a dataset

Prepare 10 photos of Carlos's face and 10 of Mr. Bean's. Twenty images alone are too few, so I resize each image to several scales to inflate the dataset.
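
As a rough sketch of the resizing step, the helper below computes the target sizes for one photo. `scaled_sizes` is a hypothetical name, and the scale factors are the ones used in dataset_generator.py. Note that an image array's shape is (height, width), while `cv2.resize` expects (width, height).

```python
# Scale factors taken from dataset_generator.py.
SCALE_FACTORS = [0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0]


def scaled_sizes(height, width, factors=SCALE_FACTORS):
    """Return (width, height) tuples, the order cv2.resize expects for dsize."""
    return [(int(width * f), int(height * f)) for f in factors]


# Example: a photo that is 200 pixels high and 300 pixels wide.
sizes = scaled_sizes(height=200, width=300)
print(sizes)
```

Each photo yields seven resized copies this way. Because the landmarks are later normalized by the detected face box, the copies may end up with quite similar features, so this probably adds less variety than the sample count suggests.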

Use the OpenCV cascade for face detection and dlib for landmark estimation.

dataset_generator.py


#!/usr/bin/env python
#coding:utf-8
import os
import cv2
import dlib
import numpy as np

# OpenCV, dlib, and NumPy do not expand "~", so resolve the home directory explicitly
base_dir = os.path.expanduser("~/face_identification")

cascade_path = base_dir + "/model/haarcascade_frontalface_alt.xml"
cascade = cv2.CascadeClassifier(cascade_path)

model_path = base_dir + "/model/shape_predictor_68_face_landmarks.dat"
predictor = dlib.shape_predictor(model_path)
detector = dlib.get_frontal_face_detector()

image_file_dir = base_dir + "/images/carlos/"
#image_file_dir = base_dir + "/images/rowan/"

save_file_path = base_dir + "/dataset/carlos.csv"
#save_file_path = base_dir + "/dataset/rowan.csv"

face_landmarks = []
for n in range(10):
	image_file_name = "carlos"+str(n)+".jpeg"
	#image_file_name = "rowan"+str(n)+".jpeg"
	
	raw_img = cv2.imread(image_file_dir+image_file_name)
	# shape is (rows, cols, channels), i.e. height first, then width
	original_height, original_width = raw_img.shape[:2]
	multiple_list = [0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0]
	for m in multiple_list:
		# cv2.resize expects the target size as (width, height)
		size = (int(original_width*m), int(original_height*m))
		img = cv2.resize(raw_img, size)

		# cv2.imread returns BGR, so convert from BGR rather than RGB
		gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
		faces = cascade.detectMultiScale(gray_img)

		if len(faces) != 0:
			for(x, y, width, height) in faces:
				cv2.rectangle(img, (x, y), (x+width, y+height), (0, 0, 255), 1)
				rects = detector(gray_img, 1)
				landmarks = []
				for rect in rects:
					landmarks.append(np.array([[p.x, p.y] for p in predictor(gray_img, rect).parts()]))
	
				for landmark in landmarks:	
					face_landmark = []
					for i in range(len(landmark)):
						cv2.drawMarker(img, (landmark[i][0], landmark[i][1]), (21, 255, 12))
						# Normalize coordinates to a percentage of the face bounding box
						landmark_x = (landmark[i][0]-x)*100.00/width
						landmark_y = (landmark[i][1]-y)*100.00/height
						face_landmark.append(landmark_x)
						face_landmark.append(landmark_y)
					face_landmarks.append(np.array(face_landmark).flatten())
					

		cv2.imshow("viewer", img)
		key = cv2.waitKey(100)
	
print("finish")
np_dataset = np.array(face_landmarks)
np.savetxt(save_file_path, np_dataset)

Output the landmarks of Carlos Ghosn's and Mr. Bean's faces to csv files. (Swap the commented-out lines in dataset_generator.py and run it twice, once per person.)


Each face yields 68 landmarks. The x and y coordinates of each point are stored in an array in order, so one face gives 68 x 2 = 136 values. Because the raw coordinates vary greatly with the size of the face, they are normalized relative to the detected face rectangle. My understanding of this part is still shaky. (Please tell me if there is a better way.)
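
The normalization can be sketched in plain Python as below. `normalize_landmarks` is a hypothetical helper that mirrors the arithmetic in dataset_generator.py: each coordinate becomes a percentage of the face box, where the box is assumed to be the (x, y, width, height) tuple returned by the cascade.

```python
def normalize_landmarks(points, box):
    """Convert absolute (x, y) landmarks to percentages of the face box.

    points: list of (x, y) pixel coordinates
    box:    (x, y, width, height) of the detected face
    Returns a flat list [x0, y0, x1, y1, ...].
    """
    box_x, box_y, box_w, box_h = box
    features = []
    for px, py in points:
        features.append((px - box_x) * 100.0 / box_w)
        features.append((py - box_y) * 100.0 / box_h)
    return features


# The same two points on a face photographed at 1x and at 2x scale.
small = normalize_landmarks([(20, 30), (60, 70)], (10, 10, 100, 100))
large = normalize_landmarks([(40, 60), (120, 140)], (20, 20, 200, 200))
print(small)  # [10.0, 20.0, 50.0, 60.0]
print(large)  # [10.0, 20.0, 50.0, 60.0]
```

With 68 points the returned list has 136 entries, which is the input size the network expects.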

Network configuration

Create a simple feed-forward neural network with Keras. It has three hidden layers.
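
To get a feel for the size of this network, the sketch below counts the trainable parameters of the dense layers, using the layer widths from network_model.py (136 inputs, then 1024, 512, 256, and 2 units). `dense_params` is a hypothetical helper. Activation and Dropout layers have no trainable parameters.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


# Input size followed by the width of each Dense layer in network_model.py.
layer_sizes = [136, 1024, 512, 256, 2]

per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)

print(per_layer)  # [140288, 524800, 131328, 514]
print(total)      # 796930
```

Roughly 800,000 parameters is a lot for a dataset built from 20 photos, so some overfitting would not be surprising.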

network_model.py


#!/usr/bin/env python
#coding:utf-8
from keras.models import Sequential
from keras.layers import Activation, Dense, Dropout

class DNNModel():
    def __init__(self):
        self.model = Sequential()
        self.model.add(Dense(1024, input_dim=136)) 
        self.model.add(Activation('relu'))
        self.model.add(Dropout(0.1))

        self.model.add(Dense(512))
        self.model.add(Activation('relu'))
        self.model.add(Dropout(0.1))

        self.model.add(Dense(256))
        self.model.add(Activation('relu'))
        self.model.add(Dropout(0.1))

        self.model.add(Dense(2))  # Output size matches the number of classes
        self.model.add(Activation('softmax'))

Learning

train.py


#!/usr/bin/env python
#coding:utf-8
import os
import numpy as np
import keras

from network_model import DNNModel
from keras.optimizers import RMSprop, SGD, Adam
from keras.utils import to_categorical
from keras.utils import np_utils

# open() and Keras do not expand "~", so resolve the home directory explicitly
base_dir = os.path.expanduser("~/face_identification")

carlos_data_path = base_dir + "/dataset/carlos.csv"
rowan_data_path = base_dir + "/dataset/rowan.csv"

weight_file_path = base_dir + "/model/weight.hdf5"

landmarks = []
labels = []

# The with statement closes each file automatically
with open(carlos_data_path, "r") as f:
	carlos_lines = f.read().split("\n")

with open(rowan_data_path, "r") as f:
	rowan_lines = f.read().split("\n")

for i in range(len(carlos_lines)-1):
	carlos_line = carlos_lines[i].split(" ")
	landmarks.append(np.array(carlos_line).flatten())
	labels.append(0) #Carlos is 0

for i in range(len(rowan_lines)-1):
	rowan_line = rowan_lines[i].split(" ")
	landmarks.append(np.array(rowan_line).flatten())
	labels.append(1) #Mr.Bean is 1

landmarks = np.asarray(landmarks).astype("float32")
labels = np_utils.to_categorical(labels, 2)

model = DNNModel().model
model.summary()
# categorical_crossentropy is the usual pairing for a softmax output with one-hot labels
model.compile(loss='categorical_crossentropy', optimizer=Adam(lr=0.0001), metrics=['accuracy'])

history = model.fit(landmarks, labels,
    batch_size=64,
    epochs=3000)

model.save_weights(weight_file_path)
print("model was saved.")

Training should finish in less than 5 minutes.
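
For a rough sense of the training workload, the sketch below counts weight updates. It assumes, as an upper bound, that every one of the 10 photos per person yields exactly one face at each of the 7 scales, giving 140 samples. The real count depends on how many faces the detectors actually find. `training_steps` is a hypothetical helper. The batch size and epoch count are the values from train.py.

```python
import math


def training_steps(num_samples, batch_size, epochs):
    """Return (steps per epoch, total weight updates)."""
    steps_per_epoch = int(math.ceil(num_samples / float(batch_size)))
    return steps_per_epoch, steps_per_epoch * epochs


# Assumed upper bound: 2 people x 10 photos x 7 scales, one face per image.
num_samples = 2 * 10 * 7

steps_per_epoch, total_updates = training_steps(num_samples, batch_size=64, epochs=3000)
print(num_samples, steps_per_epoch, total_updates)  # 140 3 9000
```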

Results

result0.jpeg: Correct answer
result2.jpeg: Correct answer
result4.jpeg: Correct answer
result_.jpeg: Correct answer
result1.jpeg: Failure. Carlos's face was not detected, and Mr. Bean was misclassified.
result3.jpeg: Failure

test.py


#!/usr/bin/env python
#coding:utf-8
import os
import cv2
import dlib
import numpy as np
import tensorflow as tf

from network_model import DNNModel

# OpenCV, dlib, and Keras do not expand "~", so resolve the home directory explicitly
base_dir = os.path.expanduser("~/face_identification")

cascade_path = base_dir + "/model/haarcascade_frontalface_alt.xml"
cascade = cv2.CascadeClassifier(cascade_path)

model_path = base_dir + "/model/shape_predictor_68_face_landmarks.dat"
predictor = dlib.shape_predictor(model_path)
detector = dlib.get_frontal_face_detector()

trained_model_path = base_dir + "/model/weight.hdf5"
model = DNNModel().model
model.load_weights(trained_model_path)
graph = tf.get_default_graph()

test_image_path = base_dir + "/images/test.jpeg"
result_image_path = base_dir + "/images/result.jpeg"

img = cv2.imread(test_image_path)
# cv2.imread returns BGR, so convert from BGR rather than RGB
gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faces = cascade.detectMultiScale(gray_img, minSize=(30, 30))

if len(faces) != 0:
	for(x, y, width, height) in faces:
		cv2.rectangle(img, (x, y), (x+width, y+height), (0, 0, 255), 1)
		rects = detector(gray_img, 1)
		landmarks = []
		for rect in rects:
			landmarks.append(np.array([[p.x, p.y] for p in predictor(gray_img, rect).parts()]))

		for landmark in landmarks:
			input_data = []
			face_landmark = []
			for i in range(len(landmark)):
				# Normalize coordinates to a percentage of the face bounding box
				landmark_x = (landmark[i][0]-x)*100.00/width
				landmark_y = (landmark[i][1]-y)*100.00/height
				face_landmark.append(landmark_x)
				face_landmark.append(landmark_y)

			face_landmark = np.array(face_landmark).flatten()
			input_data.append(face_landmark)
			with graph.as_default():
				pred = model.predict(np.array(input_data))

			result_idx = np.argmax(pred[0])
			if result_idx == 0:
				text = "Carlos:" + str(pred[0][result_idx])
			else:
				text = "Rowan:" + str(pred[0][result_idx])

			# Draw the label inside the loop so that text is always defined when it is used
			cv2.putText(img, text, (x, y), cv2.FONT_HERSHEY_SIMPLEX, 0.5,(0,0,255))

			
#cv2.imshow("viewer", img)
cv2.imwrite(result_image_path, img)
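
One thing to watch in test.py: the dlib detector runs over the whole image inside the loop over cascade boxes, so when a photo contains several faces, every landmark set is normalized against every box. One possible remedy, which is an assumption on my part and not in the original code, is to pair each landmark set with the box that contains most of its points. `fraction_inside` and `best_box` are hypothetical helpers.

```python
def fraction_inside(points, box):
    """Share of the (x, y) points that fall inside box = (x, y, width, height)."""
    if not points:
        return 0.0
    x, y, w, h = box
    inside = sum(1 for px, py in points if x <= px <= x + w and y <= py <= y + h)
    return inside / float(len(points))


def best_box(points, boxes):
    """Return the box holding the largest share of the points, or None if there are no boxes."""
    if not boxes:
        return None
    return max(boxes, key=lambda b: fraction_inside(points, b))


# Two detected face boxes and one landmark set that belongs to the second box.
boxes = [(100, 100, 50, 50), (10, 10, 30, 30)]
points = [(15, 15), (20, 25), (35, 38)]
print(best_box(points, boxes))  # (10, 10, 30, 30)
```

Normalizing each landmark set only against its matched box would keep the features of different people in the same photo from being mixed.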

Conclusion

Carlos Ghosn is tough.
