[PYTHON] I tried to implement the Harry Potter Sorting Hat with a CNN

Background

I often hear conversations like this: "You really look like a Slytherin." "Isn't he more of a Gryffindor?" "What even is a Hufflepuff?" Certainly, some of it makes intuitive sense. If I remember correctly, the books say that the Sorting into houses matters a great deal, and if the spirit affects the body, then the characteristics of each house should also show up in the face, shouldn't they? If so, there should be facial features specific to each house, and they should be learnable! That was the motivation. Below is my personal, subjective sorting forecast.

fig.001.jpeg fig.002.jpeg

No malice intended, of course.

Execution environment

Purpose

Given an input image containing a person's face, the goal is to build a sorting neural network that outputs which house that face would be sorted into. There are four houses, Gryffindor, Ravenclaw, Hufflepuff, and Slytherin, so the network is a four-class classifier. A conceptual diagram of the whole sorting neural network is shown below.

fig.003.jpeg

The next section focuses on dataset creation and the construction of the neural network model used for training.

Method

As stated in the Purpose section, this section describes dataset creation and the configuration of the neural network model.

Dataset creation

The dataset was built with the image-collection Python script from a previous article on creating machine-learning datasets. I collected images using the names of the characters and actors belonging to each house as search queries, then manually checked that each image showed the right character and stored it under a directory for its house. Some examples of the collected images for each house are shown below.

fig.004.jpeg

Each image was resized to 100 x 100 pixels, and 50 images were collected per house, for a total of 200 images. Naturally the dataset is dominated by Western faces, and it is heavily biased toward particular characters. Ravenclaw and Hufflepuff in particular were hard to collect; there are simply too few of them. Twenty randomly selected images were set aside as test data. A minimal sketch of how such a directory-per-house dataset can be loaded is shown below.
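The following is a rough loading sketch, not the actual collection script from the previous article; the directory layout, label order, and file extension are assumptions made here for illustration.

dataset.py (hypothetical)

import os
import glob

import cv2
import numpy as np

# assumed directory layout: ./dataset/<house>/*.jpeg, one directory per house
HOUSES = ['gryffindor', 'ravenclaw', 'hufflepuff', 'slytherin']


def load_dataset(root='./dataset', size=100):
    xs, ts = [], []
    for label, house in enumerate(HOUSES):
        for path in glob.glob(os.path.join(root, house, '*.jpeg')):
            img = cv2.resize(cv2.imread(path), (size, size))
            # HWC uint8 -> CHW float32 in [0, 1]
            xs.append(img.transpose(2, 0, 1).astype(np.float32) / 255.0)
            ts.append(label)
    return np.array(xs, dtype=np.float32), np.array(ts, dtype=np.int32)


x_all, t_all = load_dataset()
# hold out 20 randomly chosen images as test data, keep the remaining 180 for training
perm = np.random.permutation(len(x_all))
x_train, t_train = x_all[perm[20:]], t_all[perm[20:]]
x_test, t_test = x_all[perm[:20]], t_all[perm[:20]]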

Model configuration

Since the task is only a four-class classification, I didn't think a very large network was necessary, but I loosely based the configuration on AlexNet. The model configuration is shown below.

model.py


import chainer.functions as F
import chainer.links as L
from chainer import Chain


class Model(Chain):
    def __init__(self):
        super(Model, self).__init__(
            # three convolution + batch-norm stages (input: 3 x 100 x 100)
            conv1=L.Convolution2D(3, 128, 7, stride=1),
            bn2=L.BatchNormalization(128),
            conv3=L.Convolution2D(128, 256, 5, stride=1),
            bn4=L.BatchNormalization(256),
            conv5=L.Convolution2D(256, 384, 3, stride=1),
            bn6=L.BatchNormalization(384),
            # after the last pooling the feature map is 384 x 4 x 4 = 6144
            fc7=L.Linear(6144, 8192),
            fc8=L.Linear(8192, 1024),
            fc9=L.Linear(1024, 4),  # four houses -> four output classes
        )

    def __call__(self, x, train=True):
        h = F.max_pooling_2d(self.bn2(F.relu(self.conv1(x))), 3, stride=3)
        h = F.max_pooling_2d(self.bn4(F.relu(self.conv3(h))), 3, stride=3)
        h = F.max_pooling_2d(self.bn6(F.relu(self.conv5(h))), 2, stride=2)
        # Chainer v1-style dropout: enabled only while training
        h = F.dropout(F.relu(self.fc7(h)), train=train)
        h = F.dropout(F.relu(self.fc8(h)), train=train)
        y = self.fc9(h)
        return y


class Classifier(Chain):
    def __init__(self, predictor):
        super(Classifier, self).__init__(predictor=predictor)
        self.train = True

    def __call__(self, x, t, train=True):
        # forward the CNN, then compute softmax cross-entropy loss and accuracy
        y = self.predictor(x, train)
        self.loss = F.softmax_cross_entropy(y, t)
        self.acc = F.accuracy(y, t)
        return self.loss

The results of training using the above data set and model will be described in the next section.
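Before moving on, here is a minimal training-loop sketch matching that setup (180 training images, 20 test images, 100 epochs). It is not the actual training script: it assumes the x_train/t_train/x_test/t_test arrays from the dataset-loading sketch above and uses the same Chainer v1-style API as the model code.

train.py (hypothetical)

import numpy as np
from chainer import Variable, optimizers

model = Classifier(Model())        # wrap the CNN with the loss/accuracy helper
optimizer = optimizers.Adam()
optimizer.setup(model)

batchsize = 20
n_epoch = 100

for epoch in range(n_epoch):
    perm = np.random.permutation(len(x_train))
    for i in range(0, len(x_train), batchsize):
        x = Variable(x_train[perm[i:i + batchsize]])
        t = Variable(t_train[perm[i:i + batchsize]])
        model.zerograds()          # clear gradients from the previous step
        loss = model(x, t, train=True)
        loss.backward()            # backpropagate through the CNN
        optimizer.update()
    # evaluate on the 20 held-out test images
    model(Variable(x_test), Variable(t_test), train=False)
    print(epoch, 1.0 - float(model.acc.data))   # test error rate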

Results

This section describes the training curve of the model and the outputs of the implemented sorting neural network.

Model training curve

The dataset contains 200 images in total, of which 180 are training data and 20 are test data. The training and test error rates (1 - accuracy) over 100 epochs are shown below.

CNN_ErrorRate_epoch100.jpg

Since the lowest test error was 0.15, the model achieved 85% accuracy on this four-class classification.

Sorting neural network execution results

Here are the results of feeding real images into the sorting neural network with the trained model loaded.

fig.005.jpeg fig.006.jpeg

There is no ground truth, but at least I could confirm that the network produces the desired kind of output for an input image. As I expected, Hiroshi Abe was sorted into Gryffindor and Ariyoshi into Slytherin, while Gacky and Becky came out exactly the opposite of what I expected. But a Slytherin Gacky isn't bad either; I wouldn't mind being bullied. Since this was getting fun, I next looked for an image that would let me sort a large number of people at once.

fig.007.jpeg

This is the result of feeding in SMAP (Sports Music Assemble People), the former national idol group. Even a large group can be sorted in one go! And the result is surprisingly convincing, lol. I was just playing around, but since there is no correct answer in the first place, it's fun precisely because I don't have to take responsibility for the accuracy. Below is a quick guide to playing with it yourself.

How to play

The source code for the sorting neural network is on GitHub, and the model trained for this article is available here. As a simple way to try it, place the downloaded trained model in the cloned directory and run the command below. Note that the face-detection step uses OpenCV's Haar cascades, so create a haarcascade directory and put a face-detection model (such as haarcascade_frontalface_default.xml) directly under it. The path that references the model can be changed in the code.

$ python main.py -i ImagePath -m ./LearnedModel.model -p 1

One caveat: sorting labels are only drawn for faces that the initial face-detection step finds, so you need to input images whose faces OpenCV's detector can actually pick up. Detection accuracy also depends on the cascade model, so if it fails once, try different images or models rather than giving up. A rough sketch of the detection-plus-classification pipeline is given below.
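The following is a rough sketch of the inference pipeline, not the actual main.py: the label order, the model file name, and the use of serializers.load_npz are assumptions. Faces are detected with OpenCV's Haar cascade, each crop is resized to 100 x 100, and the trained CNN classifies it.

predict.py (hypothetical)

import cv2
import numpy as np
import chainer.functions as F
from chainer import Variable, serializers

HOUSES = ['Gryffindor', 'Ravenclaw', 'Hufflepuff', 'Slytherin']  # assumed label order

model = Model()
# assumes the trained model was saved with serializers.save_npz
serializers.load_npz('./LearnedModel.model', model)

cascade = cv2.CascadeClassifier('./haarcascade/haarcascade_frontalface_default.xml')
img = cv2.imread('input.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

for (x, y, w, h) in cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5):
    face = cv2.resize(img[y:y + h, x:x + w], (100, 100))
    # HWC uint8 -> CHW float32 in [0, 1], plus a batch dimension
    face = face.transpose(2, 0, 1).astype(np.float32)[np.newaxis] / 255.0
    out = model(Variable(face), train=False)
    label = HOUSES[int(F.softmax(out).data.argmax())]
    # draw the detected face and its predicted house on the image
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.putText(img, label, (x, y - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)

cv2.imwrite('result.jpg', img)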

Discussion

The amount of training data strongly affects the results, and this time the actors appearing in the films were too unevenly distributed across the houses, so I could not collect as many images as I wanted. Even so, the fact that the accuracy reached 85% may mean the model picked up some characteristics of each house. On the other hand, given how few distinct people there are, it would not be surprising if the same faces appear in both training and validation data, so perhaps 85% should be taken for granted. As for the model configuration, I referred to AlexNet, but since the input images are only 100 x 100, I judged that so much capacity was unnecessary and reduced the number of convolution layers. Generalization might also have improved by making the 7x7 and 5x5 kernel layers somewhat deeper. I have not tuned the model seriously, so a random search over the hyperparameters could still be expected to improve accuracy.

Somehow it felt right that Ariyoshi ended up in Slytherin. By the way, I was sorted into Gryffindor. Nice.
