[PYTHON] Classify anime faces with deep learning in Keras (sequel)

Introduction

About a year ago I wrote a program in Chainer to classify anime faces; this time I wrote the same thing in Keras. The program is available on GitHub.

Dataset

The dataset can be obtained from animeface-character-dataset. Reference: I tried to extract the features of the anime face with Denoising AutoEncoder

Dataset preprocessing

I improved the preprocessing a little from last time. Images are resized to 32x32 RGB data (shape = (3, 32, 32)). The difference from last time is that it should now work without first deleting empty folders that contain no data. The requirements are the libraries imported in the script below (OpenCV, NumPy, six, and progressbar).

#! -*- coding: utf-8 -*-

import os
import six.moves.cPickle as pickle
import numpy as np
try:
	import cv2 as cv
except ImportError:
	# cv2 is only needed when building the dataset from the raw images,
	# not when loading an already pickled dataset.
	pass
from progressbar import ProgressBar

class AnimeFaceDataset:
	def __init__(self):
		self.data_dir_path = u"./animeface-character-dataset/thumb/"
		self.data = None
		self.target = None
		self.n_types_target = -1
		self.dump_name = u'animedata'
		self.image_size = 32

	def get_dir_list(self):
		tmp = os.listdir(self.data_dir_path)
		if tmp is None:
			return None
		ret = []
		for x in tmp:
			if os.path.isdir(self.data_dir_path+x):
				if len(os.listdir(self.data_dir_path+x)) >= 2:
					ret.append(x)
		return sorted(ret)

	def get_class_id(self, fname):
		dir_list = self.get_dir_list()
		dir_name = filter(lambda x: x in fname, dir_list)
		return dir_list.index(dir_name[0])

	def get_class_name(self, id):
		dir_list = self.get_dir_list()
		return dir_list[id]

	def load_data_target(self):
		if os.path.exists(self.dump_name+".pkl"):
			print "load from pickle"
			self.load_dataset()
			print "done"
		else:
			dir_list = self.get_dir_list()
			ret = {}
			self.target = []
			self.data = []
			print("now loading...")
			pb = ProgressBar(min_value=0, max_value=len(dir_list)).start()
			for i, dir_name in enumerate(dir_list):
				pb.update(i)
				file_list = os.listdir(self.data_dir_path+dir_name)
				for file_name in file_list:
					root, ext = os.path.splitext(file_name)
					if ext == u'.png':
						abs_name = self.data_dir_path+dir_name+'/'+file_name
						# read class id i.e., target
						class_id = self.get_class_id(abs_name)
						self.target.append(class_id)
						# read image i.e., data
						image = cv.imread(abs_name)
						image = cv.resize(image, (self.image_size, self.image_size))
						image = image.transpose(2,0,1)
						image = image/255.
						self.data.append(image)
			pb.finish()
			print("done.")
			self.data = np.array(self.data, np.float32)
			self.target = np.array(self.target, np.int32)

			self.dump_dataset()

	def dump_dataset(self):
		pickle.dump((self.data,self.target), open(self.dump_name+".pkl", 'wb'), -1)

	def load_dataset(self):
		self.data, self.target = pickle.load(open(self.dump_name+".pkl", 'rb'))


if __name__ == '__main__':
	dataset = AnimeFaceDataset()
	dataset.load_data_target()

When I actually load it:

In [1]: from animeface import AnimeFaceDataset

In [2]: dataset = AnimeFaceDataset()

In [3]: dataset.load_data_target()
load from pickle
done

In [4]: x = dataset.data

In [5]: y = dataset.target

In [6]: print x.shape, y.shape
(14490, 3, 32, 32) (14490,)

So the number of samples is 14,490 and the number of classes (characters) is 176. (Up to this point, it is almost the same as last time.)
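The 176 classes are not printed in the session above, but the counts can be checked directly from the loaded arrays, using the same expression the training script below uses (len(set(y))):

print(len(y))        # 14490 samples
print(len(set(y)))   # 176 classes (characters)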

Keras Implementation of Convolutional Neural Networks

Model building

from keras.layers.convolutional import Convolution2D
from keras.layers.convolutional import MaxPooling2D
from keras.layers.core import Activation
from keras.layers.core import Dense
from keras.layers.core import Dropout
from keras.layers.core import Flatten
from keras.models import Sequential

def build_deep_cnn(num_classes=3):
	model = Sequential()

	model.add(Convolution2D(96, 3, 3, border_mode='same', input_shape=(3, 32, 32)))
	model.add(Activation('relu'))

	model.add(Convolution2D(128, 3, 3))
	model.add(Activation('relu'))
	model.add(MaxPooling2D(pool_size=(2, 2)))
	model.add(Dropout(0.5))

	model.add(Convolution2D(256, 3, 3, border_mode='same'))
	model.add(Activation('relu'))

	model.add(Convolution2D(256, 3, 3))
	model.add(Activation('relu'))
	model.add(MaxPooling2D(pool_size=(2, 2)))
	model.add(Dropout(0.5))

	model.add(Flatten())
	model.add(Dense(1024))
	model.add(Activation('relu'))
	model.add(Dropout(0.5))

	model.add(Dense(num_classes))
	model.add(Activation('softmax'))
	
	return model

You can build a network simply by first creating a sequential model with model = Sequential() and then adding layers such as Convolution2D and Dense to it. Convolution2D corresponds to a convolution layer, and Dense corresponds to a fully connected layer. Only the very first layer,

Convolution2D(96, 3, 3, border_mode='same', input_shape=(3, 32, 32))

needs input_shape to be specified. To briefly explain Convolution2D: the first argument is the number of convolution kernels, and the second and third arguments are the kernel size. border_mode can be either same or valid. With same, the padding is half the kernel size, so the height and width of the output are the same as those of the input. With valid there is no padding, so the height and width of the output are smaller than those of the input. Regarding padding, [here](https://github.com/vdumoulin/conv_arithmetic) is easy to understand. This time, with same, the shape of the convolution kernel is (96, 3, 3) and the shape of the input is (3, 32, 32), so the shape of this layer's output is (96, 32, 32). With valid, the output shape would be (96, 32-(3-1), 32-(3-1)) = (96, 30, 30). You can also set stride and other options.
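As a quick check of those shapes, each layer's output_shape can be inspected after it is added (a minimal sketch assuming the same Keras 1.x API and Theano-style channel ordering used in this post):

from keras.models import Sequential
from keras.layers.convolutional import Convolution2D

m = Sequential()
m.add(Convolution2D(96, 3, 3, border_mode='same', input_shape=(3, 32, 32)))
print(m.layers[-1].output_shape)  # (None, 96, 32, 32): 'same' keeps height/width
m.add(Convolution2D(128, 3, 3, border_mode='valid'))
print(m.layers[-1].output_shape)  # (None, 128, 30, 30): 'valid' shrinks by (kernel size - 1)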

Model training

import numpy as np

from keras.callbacks import EarlyStopping
from keras.callbacks import LearningRateScheduler
from keras.optimizers import Adam
from keras.optimizers import SGD
from animeface import AnimeFaceDataset

class Schedule(object):
	def __init__(self, init=0.01):
		self.init = init
	def __call__(self, epoch):
		lr = self.init
		for i in xrange(1, epoch+1):
			if i%5==0:
				lr *= 0.5
		return lr

def get_schedule_func(init):
	return Schedule(init)

dataset = AnimeFaceDataset()
dataset.load_data_target()
x = dataset.data
y = dataset.target
n_class = len(set(y))
perm = np.random.permutation(len(y))
x = x[perm]
y = y[perm]

model = build_deep_cnn(n_class)
model.summary()
init_learning_rate = 1e-2
opt = SGD(lr=init_learning_rate, decay=0.0, momentum=0.9, nesterov=False)
model.compile(loss='sparse_categorical_crossentropy', optimizer=opt, metrics=["acc"])
early_stopping = EarlyStopping(monitor='val_loss', patience=3, verbose=0, mode='auto')
lrs = LearningRateScheduler(get_schedule_func(init_learning_rate))

hist = model.fit(x, y, 
				batch_size=128, 
				nb_epoch=50, 
				validation_split=0.1, 
				verbose=1, 
				callbacks=[early_stopping, lrs])

In the callbacks argument of the fit function, you can conveniently specify things such as EarlyStopping, which automatically stops training once it judges that learning has converged, and LearningRateScheduler, which adjusts the learning rate at each epoch.

To briefly explain LearningRateScheduler: it takes as its argument a function that, given the current epoch number (starting from 0), returns the learning rate. For example,

class Schedule(object):
    def __init__(self, init=0.01):
        self.init = init
    def __call__(self, epoch):
        lr = self.init
        for i in xrange(1, epoch+1):
            if i%5==0:
                lr *= 0.5
        return lr

def get_schedule_func(init):
    return Schedule(init)

lrs = LearningRateScheduler(get_schedule_func(0.01))

then the initial learning rate is 0.01 and the learning rate is halved every 5 epochs.
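As a quick check (not from the original post), calling the Schedule instance directly shows the values the callback will receive; note that Keras passes the epoch index starting from 0:

s = Schedule(init=0.01)
print([s(e) for e in range(12)])
# [0.01, 0.01, 0.01, 0.01, 0.01, 0.005, 0.005, 0.005, 0.005, 0.005, 0.0025, 0.0025]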

In addition, the per-epoch loss and metrics are stored as a dictionary in the .history attribute of the object returned by fit, so they can be conveniently visualized with pandas and matplotlib.

import pandas as pd
import matplotlib.pyplot as plt
plt.style.use("ggplot")

df = pd.DataFrame(hist.history)
df.index += 1
df.index.name = "epoch"
df[["acc", "val_acc"]].plot(linewidth=2)
plt.savefig("acc_history.pdf")
df[["loss", "val_loss"]].plot(linewidth=2)
plt.savefig("loss_history.pdf")

Experimental results

The results are as follows; the validation accuracy was a little under 60%. If you use Adam as the optimizer instead, the validation accuracy exceeds 70%, so please try it.
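For example, swapping SGD for Adam could look like the following (a sketch with assumed settings, not the exact configuration behind the 70% figure; the fixed learning-rate schedule is dropped here since Adam adapts its step sizes on its own):

from keras.optimizers import Adam

model = build_deep_cnn(n_class)
opt = Adam(lr=1e-3)
model.compile(loss='sparse_categorical_crossentropy', optimizer=opt, metrics=["acc"])
hist = model.fit(x, y,
				batch_size=128,
				nb_epoch=50,
				validation_split=0.1,
				verbose=1,
				callbacks=[early_stopping])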

Transition of the loss

スクリーンショット 2016-06-12 14.40.39.png

Transition of the accuracy

スクリーンショット 2016-06-12 14.40.15.png
