[PYTHON] CNN determines which college a beautiful woman is likely to be in

INTRODUCTION

The image above is from Miss Campus Ritsumeikan in 2014. Everyone is very beautiful.

On the other hand, at first glance, everyone seems to have a similar face. Do kind call friends? We will call this __Ritsumeikan-like face __.

Also, I often hear words like "Seigaku-ish" and "Let's go to Gakushuin", but this is also because there are __ Seigaku-like faces __ and __ Gakushuin-like faces __. I think so.

Therefore, this time, I tried to deep learn the facial tendency of each university and created a model that can determine which university a beautiful woman is likely to be in.

APPROACH

1. Image collection of women by university

First, get the images of women from each university. The Miss Contest Portal Site has a systematic collection of photos of past Miss Cons from each university, so I used them.

photo_collector.py


# -*- coding:utf-8 -*-

import os
import bs4
import time
import random
import urllib2
from itertools import chain
from urllib import urlretrieve

base_url = 'http://misscolle.com'


def fetch_page_urls():
    html = urllib2.urlopen('{}/versions'.format(base_url))
    soup = bs4.BeautifulSoup(html, 'html.parser')

    columns = soup.find_all('ul', class_='columns')
    atags = map(lambda column: column.find_all('a'), columns)

    with open('page_urls.txt', 'w') as f:
        for _ in chain.from_iterable(atags):
            path = _.get('href')
            if not path.startswith('http'):  # Relative path
                path = '{}{}'.format(base_url, path)
            if path[-1] == '/':  # Normalize
                path = path[:-1]
            f.write('{}\n'.format(path))


def fetch_photos():
    with open('page_urls.txt') as f:
        for url in f:
            # Make directories for saving images
            dirpath = 'photos/{}'.format(url.strip().split('/')[-1])
            os.makedirs(dirpath)

            html = urllib2.urlopen('{}/photo'.format(url.strip()))
            soup = bs4.BeautifulSoup(html, 'html.parser')

            photos = soup.find_all('li', class_='photo')
            paths = map(lambda path: path.find('a').get('href'), photos)

            for path in paths:
                filename = '_'.join(path.split('?')[0].split('/')[-2:])
                filepath = '{}/{}'.format(dirpath, filename)
                # Download image file
                urlretrieve('{}{}'.format(base_url, path), filepath)
                # Add random waiting time (4 - 6 sec)
                time.sleep(4 + random.randint(0, 2))


if __name__ == '__main__':
    fetch_page_urls()
    fetch_photos()

When executed, it will be saved in the following directory structure.

http://misscolle.com/img/contests/aoyama2015/1/1.jpg is mapped to photos / aoyama / 2015 / 1_1.jpg.

photos/
├── aoyama
│   ├── 2008
│   │   ├── 1_1.jpg
│   │   ├── 1_2.jpg
│   │   ├── ...
│   │   ├── 2_1.jpg
│   │   ├── 2_2.jpg
│   │   ├── ...
│   │   └── 6_9.jpg
│   ├── 2009
│   │   ├── 1_1.jpg
│   │   ├── 1_2.jpg
│   │   ├── ...

As a result, a total of 10,725 images (about 2.5G) from 82 universities were collected. It makes me happy to see it.

スクリーンショット 2017-03-20 2.22.54.png

2. Trimming the face area with OpenCV

Next, the face area is detected by OpenCV from these images and trimmed. The evaluator used was frontalface_alt2.

face_detecter.py


# -*- coding:utf-8 -*-

import os
import cv2

def main():
    for srcpath, _, files in os.walk('photos'):
        if len(_):
            continue
        dstpath = srcpath.replace('photos', 'faces')
        os.makedirs(dstpath)
        for filename in files:
            if filename.startswith('.'):  # Pass .DS_Store
                continue
            try:
                detect_faces(srcpath, dstpath, filename)
            except:
                continue


def detect_faces(srcpath, dstpath, filename):
    cascade = cv2.CascadeClassifier('haarcascade_frontalface_alt2.xml')
    image = cv2.imread('{}/{}'.format(srcpath, filename))
    gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray_image)
    # Extract when just one face is detected
    if (len(faces) == 1):
        (x, y, w, h) = faces[0]
        image = image[y:y+h, x:x+w]
        image = cv2.resize(image, (100, 100))
        cv2.imwrite('{}/{}'.format(dstpath, filename), image)


if __name__ == '__main__':
    main()

When this is executed, the face image will be saved in faces / with the same directory structure as before. I'm happy to see this too (ry

スクリーンショット 2017-03-20 19.48.52.png

3. Call CNN with Tensorflow

Finally, let's learn with CNN. First, we screened universities based on the following criteria.

--There is data on beauty pageants for the past 5 years or more. --There are more than 100 images in total.

Then, from the 20 universities squeezed by this screening, we decided to select the following 10 universities at our own discretion and classify them into 10 classes.

--Aoyama Gakuin University --Jissen Women's University

The test data for each university and the latest year is used, and the other data is used as training data. The training data and test data were 1,700 and 154, respectively.

Build a CNN with Tensorflow and learn.

cnn.py


# -*- coding:utf-8 -*-

import os
import random
import numpy as np
import tensorflow as tf

label_dict = {
    'aoyama': 0, 'jissen': 1, 'keio': 2, 'phoenix': 3, 'rika': 4,
    'rikkyo': 5, 'seikei': 6, 'sophia': 7, 'todai': 8, 'tonjo': 9
}


def load_data(data_type):
    filenames, images, labels = [], [], []
    walk = filter(lambda _: not len(_[1]) and data_type in _[0], os.walk('faces'))
    for root, dirs, files in walk:
        filenames += ['{}/{}'.format(root, _) for _ in files if not _.startswith('.')]
    # Shuffle files
    random.shuffle(filenames)
    # Read, resize, and reshape images
    images = map(lambda _: tf.image.decode_jpeg(tf.read_file(_), channels=3), filenames)
    images = map(lambda _: tf.image.resize_images(_, [32, 32]), images)
    images = map(lambda _: tf.reshape(_, [-1]), images)
    for filename in filenames:
        label = np.zeros(10)
        for k, v in label_dict.iteritems():
            if k in filename:
                label[v] = 1.
        labels.append(label)

    return images, labels


def get_batch_list(l, batch_size):
    # [1, 2, 3, 4, 5,...] -> [[1, 2, 3], [4, 5,..]]
    return [np.asarray(l[_:_+batch_size]) for _ in range(0, len(l), batch_size)]


def weight_variable(shape):
  initial = tf.truncated_normal(shape, stddev=0.1)
  return tf.Variable(initial)


def bias_variable(shape):
  initial = tf.constant(0.1, shape=shape)
  return tf.Variable(initial)


def conv2d(x, W):
  return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')


def max_pool_2x2(x):
  return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')


def inference(images_placeholder, keep_prob):
    # Convolution layer
    x_image = tf.reshape(images_placeholder, [-1, 32, 32, 3])
    W_conv1 = weight_variable([5, 5, 3, 32])
    b_conv1 = bias_variable([32])
    h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)

    # Pooling layer
    h_pool1 = max_pool_2x2(h_conv1)

    # Convolution layer
    W_conv2 = weight_variable([5, 5, 32, 64])
    b_conv2 = bias_variable([64])
    h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)

    # Pooling layer
    h_pool2 = max_pool_2x2(h_conv2)

    # Full connected layer
    W_fc1 = weight_variable([8 * 8 * 64, 1024])
    b_fc1 = bias_variable([1024])
    h_pool2_flat = tf.reshape(h_pool2, [-1, 8 * 8 * 64])
    h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

    # Dropout
    h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

    # Full connected layer
    W_fc2 = weight_variable([1024, 10])
    b_fc2 = bias_variable([10])

    return tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)


def main():
    with tf.Graph().as_default():
        train_images, train_labels = load_data('train')
        test_images, test_labels = load_data('test')
        x = tf.placeholder('float', shape=[None, 32 * 32 * 3])  # 32 * 32, 3 channels
        y_ = tf.placeholder('float', shape=[None, 10])  # 10 classes
        keep_prob = tf.placeholder('float')

        y_conv = inference(x, keep_prob)
        # Loss function
        cross_entropy = -tf.reduce_sum(y_ * tf.log(y_conv))
        tf.summary.scalar('cross_entropy', cross_entropy)
        # Minimize cross entropy by using SGD
        train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
        # Accuracy
        correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, 'float'))
        tf.summary.scalar('accuracy', accuracy)

        saver = tf.train.Saver()
        sess = tf.InteractiveSession()
        sess.run(tf.global_variables_initializer())

        summary_op = tf.summary.merge_all()
        summary_writer = tf.summary.FileWriter('./logs', sess.graph)

        batched_train_images = get_batch_list(train_images, 25)
        batched_train_labels = get_batch_list(train_labels, 25)

        train_images = map(lambda _: sess.run(_).astype(np.float32) / 255.0, np.asarray(train_images))
        test_images = map(lambda _: sess.run(_).astype(np.float32) / 255.0, np.asarray(test_images))
        train_labels, test_labels = np.asarray(train_labels), np.asarray(test_labels)

        # Train
        for step, (images, labels) in enumerate(zip(batched_train_images, batched_train_labels)):
            images = map(lambda _: sess.run(_).astype(np.float32) / 255.0, images)
            sess.run(train_step, feed_dict={ x: images, y_: labels, keep_prob: 0.5 })
            train_accuracy = accuracy.eval(feed_dict = {
                x: train_images, y_: train_labels, keep_prob: 1.0 })
            print 'step {}, training accuracy {}'.format(step, train_accuracy)
            summary_str = sess.run(summary_op, feed_dict={
                x: train_images, y_: train_labels, keep_prob: 1.0 })
            summary_writer.add_summary(summary_str, step)
        # Test trained model
        test_accuracy = accuracy.eval(feed_dict = {
            x: test_images, y_: test_labels, keep_prob: 1.0 })
        print 'test accuracy {}'.format(test_accuracy)
        # Save model
        save_path = saver.save(sess, "model.ckpt")


if __name__ == '__main__':
    main()

Due to the lack of training data, the precision rate was about 20%, and learning did not progress much. However, the accuracy of the test data is 19.48% (the base is 10% because it is classified into 10 classes), which means that 1 in 5 people is assigned, so from this learning result, it is said that a reasonable tendency was suppressed. Is that something ...

test.png

EXPERIMENTAL RESULTS

Using this learning model, I will try to determine which university Gacky looks like from the image below. Replaced main in cnn.py with the following.

2958_original2.jpg

cnn.py


def main():
    with tf.Graph().as_default():
        test_images, test_labels = load_data('experiment')
        x = tf.placeholder('float', shape=[None, 32 * 32 * 3])  # 32 * 32, 3 channels
        y_ = tf.placeholder('float', shape=[None, 10])  # 10 classes
        keep_prob = tf.placeholder('float')

        y_conv = inference(x, keep_prob)

        sess = tf.InteractiveSession()
        sess.run(tf.global_variables_initializer())
        saver = tf.train.Saver()
        saver.restore(sess, "./model.ckpt")

        test_images = map(lambda _: sess.run(_).astype(np.float32) / 255.0, np.asarray(test_images))

        print y_conv.eval(feed_dict={ x: [train_images[0]], keep_prob: 1.0 })[0]
        print np.argmax(y_conv.eval(feed_dict={ x: [train_images[0]], keep_prob: 1.0 })[0])

The following is the execution result. The result was that Gacky had a blue-eyed face (34.83%). I feel like I know what it is. After Seigaku, Nihon University and Rikkyo continued.

#Corresponds to the following
#
# label_dict = {
#     'aoyama': 0, 'jissen': 1, 'keio': 2, 'phoenix': 3, 'rika': 4,
#     'rikkyo': 5, 'seikei': 6, 'sophia': 7, 'todai': 8, 'tonjo': 9
# }
[ 0.34834844  0.0005422   0.00995418  0.21047674  0.13970862  0.15559362 0.03095848  0.09672297  0.00721581  0.00047894]
# argmax
0

RELATED WORK

-Deep learning with TensorFlow to identify the face of an idol --Sugyan memo

The other day announced in Singapore, I thought it was amazing.

-Deep learning to determine if you have big breasts from your face photo (it works or not) --Qiita

Can you identify with high accuracy the app ChiChi that you developed before (which did not pass Apple's examination) and you can see the number of cups just by holding your smartphone in your chest? I want to compete.

FUTURE WORK

At each university, the results suggest that there is a slight tendency in the face, but the accuracy is still low, so I would like to improve the accuracy by increasing the data and tuning the model. .. I built a fairly simple model this time, but if you look at the code in cifar10 in the Tensorflow tutorial, I think there are scattered elements that can be tuned.

In addition, as a horizontal development, let's learn the images of beautiful women in each prefecture of Bijin Clock, and show which prefecture-like face a beautiful woman has. I think it's also interesting.

Recommended Posts

CNN determines which college a beautiful woman is likely to be in
[Python] tkinter Code that is likely to be reused
[Python] pandas Code that is likely to be reused
How to trick and use a terrible library that is supposed to be kept globally in flask
A program that determines whether a number entered in Python is a prime number
[Ln] How to paste a symbolic link in a directory is complicated
[Python] When the priority is the same in Priority Queue, it can be acquired in the order in which it was added to the queue.