[PYTHON] It turns out I have a doppelganger, so I tried to tell us apart with artificial intelligence (laughs) (Part 1)

Nice to meet you. I'm @eve_yk, an engineering intern at a startup called Liaro. I'd like to deepen my understanding of the technology we use every day at Liaro, actively share what I learn, and with luck get some sharp criticism thrown my way by people who do this professionally. That's why I decided to write this blog! I don't have a huge amount to write about, but I'll write what I can. Thank you!

This time, I will create a face image classifier using the Convolutional Neural Network (CNN).

Purpose

On the task of face recognition, systems such as Facebook's DeepFace and Google's FaceNet (https://arxiv.org/abs/1503.03832) have achieved human-level (or better?) accuracy.

So why take on face image classification? To explain, please take a look at the following images.

First, the profile picture I have registered on Facebook.

537059_256924114444730_821185260_n.jpg

A nice smile, if I say so myself. Next, this is Mr. Kazuhiko Tanaka of "Super Maradona", a finalist of the 2015 M-1 Grand Prix.

tanaka.jpg

!?!?!?

The resemblance is uncanny!!! It was quite a shock for me, since I had never before been told that I look like someone. Clearly I have no choice but to build a classifier that can tell me and Mr. Tanaka apart. That's the (silly) reason.

0. Build a development environment

This time, I tested it in the following environment.

Using pyenv makes building the environment easy. The link below is a good reference.

Building a machine learning application development environment with Python

numpy and opencv can be installed with Anaconda; chainer and ProgressBar2 can be installed with pip.

1. Collect images for learning.

Next, collect the images used for training. It is no exaggeration to say that gathering the data is the hardest part of machine learning. In this case it is especially difficult, because there is essentially no existing dataset for such a niche purpose, so I did my best to collect the images by hand.

For now, I collected 80 images of myself and 68 images of Mr. Tanaka. (I was surprised that scouring my smartphone, PC, and Facebook turned up only this many.) Five images of each person are set aside as test images, and the remaining 75 and 63 images become the training set. The numbers are far too small, but this is just for fun, so let's go with it.
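I did this split by hand, but for reference, here is a minimal sketch of one way to script it. The directory names `raw` and `split` and the per-person subfolders are assumptions for illustration; only the "5 test images per person" comes from the article.

split_train_test.py


# coding:utf-8

"""
Hypothetical helper (not part of the original workflow):
set aside N_TEST images per person as test data, copy the rest as training data
"""

import glob
import os
import random
import shutil

RAW_DIR = "raw"     # assumed layout: raw/<person_name>/*.jpg
OUT_DIR = "split"   # produces split/train/<person_name>/ and split/test/<person_name>/
N_TEST = 5          # number of test images per person

for class_dir in glob.glob(os.path.join(RAW_DIR, "*")):
	class_name = os.path.basename(class_dir)
	files = glob.glob(os.path.join(class_dir, "*.jpg"))
	random.shuffle(files)
	subsets = {"test": files[:N_TEST], "train": files[N_TEST:]}
	for subset, subset_files in subsets.items():
		dst_dir = os.path.join(OUT_DIR, subset, class_name)
		if not os.path.exists(dst_dir):
			os.makedirs(dst_dir)
		for path in subset_files:
			shutil.copy(path, dst_dir)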

2. Preprocess the image

In order to classify faces, the images need a little preprocessing. There are four steps:

  1. Cut out the face part of the image
  2. Resize
  3. Data expansion (inversion, rotation)
  4. Convert to np.ndarray format

First, cut out the face region from each image. For this, we use the cascade classifier based on Haar-like features that ships with OpenCV; I won't train a detector myself this time. Then resize each face to 64*64 px. Chainer also implements [Spatial Pyramid Pooling](http://docs.chainer.org/en/stable/reference/functions.html#chainer.functions.spatial_pyramid_pooling_2d), which can compute a fixed-length feature regardless of the size of the input image, but I haven't used it here. Finally, the data is augmented. When classifying images, the original images are often flipped, translated, rotated, color-shifted, smoothed, and so on to inflate the amount of data. This time, the original images are flipped and rotated to increase the data.

If you code the above process, it will look like this.

face_data_augmentation.py


# coding:utf-8

"""
Extract the face area existing in the image in the specified folder
Invert and rotate the image to expand
"""

import os
import glob
import argparse
import cv2
import numpy as np

CASCADE_PATH = "/path/to/haarcascade/haarcascade_frontalface_alt.xml"
cascade = cv2.CascadeClassifier(CASCADE_PATH)

def detectFace(image):
	"""
Extract the face image part
	"""
	image_gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
	facerect = cascade.detectMultiScale(image_gray, scaleFactor=1.1, minNeighbors=3, minSize=(50, 50))
 
	return facerect

def resize(image):
	"""
Image resizing
	"""
	return cv2.resize(image, (64,64))

def rotate(image, r):
	"""
Rotate r degrees around the center of the image
	"""
	h, w, ch = image.shape #Image array size
	M = cv2.getRotationMatrix2D((w/2, h/2), r, 1) #Rotation matrix around the image center
	rotated = cv2.warpAffine(image, M, (w, h))

	return rotated

if __name__ == "__main__":
	parser = argparse.ArgumentParser(description='clip face-images from image files and do data augmentation.')
	parser.add_argument('-p', required=True, help='set files path.', metavar='imagefile_path')
	args = parser.parse_args()

	#Create if there is no output directory
	result_dir = args.p + "_result"
	if not os.path.exists(result_dir):
		os.makedirs(result_dir)

	face_cnt = 0

	#Get the jpg files
	files = glob.glob(args.p + "/*.jpg")
	print args.p + "/*.jpg"

	for file_name in files:
		#Image loading
		image = cv2.imread(file_name)
		if image is None:
			#Read failure
			continue

		#Rotate in 4-degree steps over the range -12 to +12 degrees
		for r in xrange(-12, 13, 4):
			rotated = rotate(image, r)

			#Face detection
			facerect_list = detectFace(rotated)
			if len(facerect_list) == 0: continue

			for facerect in facerect_list:
				#Cut out the face region
				cropped = rotated[facerect[1]:facerect[1]+facerect[3], facerect[0]:facerect[0]+facerect[2]]

				#Output the cropped face
				cv2.imwrite(result_dir+"/"+str(face_cnt)+".jpg", resize(cropped))
				face_cnt += 1

				#Output the horizontally flipped face as well
				flipped = np.fliplr(cropped)
				cv2.imwrite(result_dir+"/"+str(face_cnt)+".jpg", resize(flipped))
				face_cnt += 1
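For example, running `python face_data_augmentation.py -p ./my_faces` should write the cropped and augmented face images to `./my_faces_result` (the script appends `_result` to the given path; the directory name here is just an example).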

The images output by this code include other people's faces that appear in the photos as well as false detections, so I removed the unnecessary images by hand, one by one. Here is the completed dataset.

dataset.JPG

... Something like that. In the end I had 393 training images of myself and 187 of Mr. Tanaka.

Finally, convert the data to np.ndarray format so that chainer can handle it. The Variable class used in chainer expects np.ndarray data as input, so we convert it to that format in advance. Note that the image layout handled by Python's OpenCV and the layout expected by chainer's CNN are different:

OpenCV  => (height, width, channel)
chainer => (channel, height, width)

Convert with the following code.

make_dataset.py


# coding:utf-8

import os
import sys
import argparse
import glob
import cv2
import numpy as np

"""
Create a dataset for use with CNN
Convert images to CNN input format

The format of the dataset is as follows
	- dataset
		- train
			- [class_name_1]
				- hogehoge.jpg
				- foofoo.jpg
				- ...
			- [class_name_2]
				- hogehoge.jpg
				- ...
			- ...
		- test
			- [class_name_1]
				- hogehoge.jpg
				- ...
			- ...

"""

def transpose_opencv2chainer(x):
	"""
Convert from opencv npy format to chainer npy format
		opencv  => (height, width, channel)
		chainer => (channel, height, width)
	"""
	return x.transpose(2,0,1)

if __name__ == "__main__":
	parser = argparse.ArgumentParser(description='Create dataset for CNN')
	parser.add_argument('--input_path',   required=True, type=str)
	parser.add_argument('--output_path',  required=True, type=str)
	args = parser.parse_args()

	#Get the jpg file lists
	train_files = glob.glob(args.input_path + "/train/*/*.jpg")
	test_files  = glob.glob(args.input_path + "/test/*/*.jpg")

	#Create if there is no output directory
	if not os.path.exists(args.output_path):
		os.makedirs(args.output_path)

	train_data  = []
	train_label = []
	test_data   = []
	test_label  = []
	label_dict  = {}

	#Training data creation
	for file_name in train_files:
		image = cv2.imread(file_name)
		if image is None:
			#Read failure
			continue

		#Get the class name from the directory structure
		class_name = file_name.replace("\\", "/").split("/")[-2]
		
		#Convert to chainer format
		image = transpose_opencv2chainer(image)
		train_data.append(image)
		train_label.append(label_dict.setdefault(class_name, len(label_dict.keys())))

	#Data creation / saving
	train_data  = np.array(train_data)
	train_label = np.array(train_label)
	np.save(args.output_path+"/train_data.npy" , train_data)
	np.save(args.output_path+"/train_label.npy", train_label)

	for file_name in test_files:
		image = cv2.imread(file_name)
		if image is None:
			#Read failure
			continue

		#Get the class name from the directory structure
		class_name = file_name.replace("\\", "/").split("/")[-2]
		
		#Convert to chainer format
		image = transpose_opencv2chainer(image)
		test_data.append(image)
		test_label.append(label_dict.setdefault(class_name, len(label_dict.keys())))

	#Data creation / saving
	test_data   = np.array(test_data)
	test_label  = np.array(test_label)
	np.save(args.output_path+"/test_data.npy"  , test_data)
	np.save(args.output_path+"/test_label.npy" , test_label)
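As a quick sanity check (and a preview of the next part), here is a minimal sketch of how the saved arrays might be loaded and wrapped in chainer's Variable class. The directory name `dataset_npy` stands in for whatever was passed to `--output_path`, and the float32 cast, the division by 255, and the int32 labels are my own assumptions about what the training script will want, not something fixed by this article.

load_check.py


# coding:utf-8

"""
Hypothetical check: load the .npy files created by make_dataset.py
and wrap them in chainer Variables
"""

import numpy as np
from chainer import Variable

train_data  = np.load("dataset_npy/train_data.npy")   # (N, channel, height, width), uint8
train_label = np.load("dataset_npy/train_label.npy")  # (N,)

print(train_data.shape)   # expect something like (N, 3, 64, 64)
print(train_label.shape)  # expect (N,)

#chainer's CNN layers expect float32 input, so cast (and optionally scale to [0, 1])
x = Variable(train_data.astype(np.float32) / 255.0)
t = Variable(train_label.astype(np.int32))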

It's a little short, but that's it for this time. Next time, I'd like to describe the classifier model and actually train and evaluate the face discriminator. Look forward to it!

References

Building a machine learning application development environment with Python - Qiita

https://github.com/mitmul/chainer-cifar10

Chainer classifies CIFAR-10 - A diary of a laid-back engineer

Yoshimoto Kogyo Co., Ltd. Entertainer Profile | Super Maradona
