[Python] Trying to recognize and distinguish Shanimas character images with YOLO v3

Introduction

Do you know THE iDOLM@STER SHINY COLORS?

If you don't, go play Shanimas. The community stories and the cards are great. There are plenty of gameplay videos on YouTube, so at least give those a watch.

When training on your own data, you naturally want to use content you love, so this is a record of trying image recognition with Shanimas characters.

I'm not sure all of my terminology is correct.

Environment

Training PC

Some of the work was done on a Mac, so some of the screenshots were taken there.

All the software used is written in Python, so the procedure works regardless of OS.

Before training

First, get Darknet installed one way or another. The specific installation procedure is not covered here.

If you can run detection on the usual sample image, you're almost there. (screenshot: テスト.png)

Once you've confirmed that it works, preparation is complete!

Image collection

Before anything can be recognized, the images must first be prepared.

Crawler preparation and installation

Downloading images one by one is very tedious, so this time we use a tool called icrawler. It automatically downloads image-search results, so images can be prepared easily.

Python should already be available from the Darknet setup, so icrawler can be installed with:

pip install icrawler

Run

Write a script to collect the images.

Keywords for collecting the Shanimas idols: search results seemed to contain few standing pictures, so it might be worth adding keywords like "<character name> community" as well.


from icrawler.builtin import GoogleImageCrawler

google_crawler = GoogleImageCrawler(
    feeder_threads=1,
    parser_threads=2,
    downloader_threads=4,
    storage={'root_dir': 'shiny'}
)

filters = dict(
    size='large'
)

words = ["The Idolmaster Shiny Colors", "Shanimas", "Sakuragi Mano", "Kazano Hiori", "Hachimiya Meguru",
         "Tsukioka Kogane", "Tanaka Mamimi", "Mitsumine Yuika", "Shirase Sakuya", "Yukoku Kiriko", "Osaki Tenka", "Osaki Amana", "Kuwayama Chiyuki",
         "Komiya Kaho", "Saijo Juri", "Morino Rinze", "Sonoda Chiyoko", "Arisugawa Natsuha", "Serizawa Asahi", "Izumi Mei", "Mayuzumi Fuyuko",
         "L'Antica", "illumination STARS", "ALSTROEMERIA Shanimas", "Straylight", "After-school Climax Girls"]

max_num = 100  # Number of images per keyword; 100 is the limit for Google
# "Asakura Toru", "Ichikawa Hinana", "Higuchi Madoka", "Fukumaru Koito", "noctchill Shanimas"
# Commented out because noctchill was announced the day after the images were collected

for i, word in enumerate(words):
    google_crawler.crawl(keyword=word, filters=filters, max_num=max_num, file_idx_offset=i * max_num)


Save it as a Python file and run it. A shiny/ folder is created and the downloaded images are saved into it. (screenshot: crawl結果.png)

File rework

The downloaded files include png and gif images in addition to jpg.

Left as they are, they would interfere with training, so delete them. Converting them to jpg instead would also work, but it is more trouble, so I simply deleted them. That leaves gaps in the serial numbers, but it has no effect, so never mind.

In addition, the Shanimas card illustrations were also used for training this time. I won't explain how to download them, but they are easy to find.

Also, multibyte characters in filenames cause trouble, so rename the files appropriately. See: Easy way to rename a large number of files with serial numbers (standard Windows feature).
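As a sketch of that cleanup (the folder name shiny and the zero-padded naming scheme are my own assumptions, not from the original workflow), the non-jpg files can be deleted and the rest renamed to ASCII serial numbers like this:

```python
import os

def clean_image_dir(folder):
    """Delete non-jpg files, then rename the survivors to
    ASCII serial numbers (00001.jpg, 00002.jpg, ...)."""
    for name in sorted(os.listdir(folder)):
        if not name.lower().endswith(('.jpg', '.jpeg')):
            os.remove(os.path.join(folder, name))  # drop png, gif, ...
    # Run this BEFORE annotating; renaming afterwards would orphan the label files
    for i, name in enumerate(sorted(os.listdir(folder)), start=1):
        os.rename(os.path.join(folder, name),
                  os.path.join(folder, '%05d.jpg' % i))

if os.path.isdir('shiny'):
    clean_image_dir('shiny')  # the crawler's output folder
```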

Annotation

Once the images are ready, it is finally time to annotate. Annotating 2,700 images (in practice about 2,500, since some were deleted) is very hard work.

Don't try to do it all at once; spread it over several days.

Preparation of annotation tool

This time, labelImg was used for annotation.

Installation

git clone https://github.com/tzutalin/labelImg.git

sudo apt install pyqt5-dev-tools

cd labelImg

make qt5py3

Run

It can then be launched with:

python3 labelImg.py

File preparation

Move the prepared image files into the labelImg folder for now.

Also, edit predefined_classes.txt under labelImg/data. This pre-populates the class list, which makes annotation a little easier.

Enter the names of the 24 idols, one per line, and save.

Sakuragi Mano
Kazano Hiori
Hachimiya Meguru
Tanaka Mamimi
Yukoku Kiriko
Tsukioka Kogane
Mitsumine Yuika
Shirase Sakuya
Morino Rinze
Sonoda Chiyoko
Komiya Kaho
Saijo Juri
Arisugawa Natsuha
Osaki Amana
Osaki Tenka
Kuwayama Chiyuki
Serizawa Asahi
Mayuzumi Fuyuko
Izumi Mei
Ichikawa Hinana
Asakura Toru
Higuchi Madoka
Fukumaru Koito
Nanakusa Hazuki

Execution of annotation

Open the image folder with Open Dir and specify the save destination with Change Save Dir.

This time, specify the same folder as the images. (screenshot: ラベリング保存場所.png) Then click the save-format button labeled PascalVOC and switch it to YOLO. If you forget to change this, you will be in tears later, so be careful!

Annotate each image as it appears. Press the "W" key to enter range-selection mode, then drag a box around the idol's face. A class-selection dialog appears; choose the correct idol and click OK.

(screenshot: ラベリング中.png)

This is repeated endlessly.

(screenshot: ラベリング.png)

Also, turning on View → Auto Save Mode saves annotations automatically. With that, most of the work can be done from the keyboard alone:

"W" → select the range → type the first letter of the idol's name → pick it with the arrow keys → Enter → move to the next image with "D"

Precautions when annotating

Since you will make more than 2,000 annotations, the software sometimes crashes. When that happens, the classes.txt file in the image folder can get corrupted. If you keep it open in Notepad while working, you can restore it immediately even if it gets damaged.

Also, if a typo accidentally creates a new class, you can fix classes.txt the same way, but be sure to delete any boxes that were labeled with the accidental class.

Some of the collected images simply cannot be labeled; just skip those as they are.

(screenshot: ラベリング後.png) Confirm that a text file has been created for each image.

Data division
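For reference, labelImg in YOLO mode writes one .txt file per image. Each line holds a class index (matching the line order in classes.txt) followed by the box center, width, and height, all normalized to the image size; the numbers below are made-up examples:

```
0 0.512 0.344 0.210 0.275
```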

Split the data in the folder into training data and test data.

For this, parts of the script from How to train YOLOv2 to detect custom objects were modified so that the data is split automatically.

import glob, os

# Current directory
current_dir = os.path.dirname(os.path.abspath(__file__))

# Directory where the data will reside, relative to 'darknet.exe'
path_data = 'data/shiny/'  # match where the folder will be moved later

# Percentage of images to be used for the test set
percentage_test = 10

# Create and/or truncate train.txt and test.txt
file_train = open('train.txt', 'w')
file_test = open('test.txt', 'w')

# Populate train.txt and test.txt, skipping images with no label file
counter = 1
index_test = round(100 / percentage_test)
texts = {os.path.basename(p) for p in glob.glob(os.path.join(current_dir, "*.txt"))}
for pathAndFilename in glob.iglob(os.path.join(current_dir, "*.jpg")):
    title, ext = os.path.splitext(os.path.basename(pathAndFilename))
    if (title + ".txt") in texts:
        if counter == index_test:
            counter = 1
            file_test.write(path_data + title + '.jpg' + "\n")
        else:
            file_train.write(path_data + title + '.jpg' + "\n")
            counter = counter + 1

file_train.close()
file_test.close()

Give it a suitable filename, save it in the same folder as the images, and run it. Only images that have a matching text file are written to train.txt or test.txt. The share of test data is set by percentage_test = 10; change it as appropriate.
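Each line written is simply path_data plus the filename. Since the folder will end up at darknet/data/shiny, path_data should be 'data/shiny/' so that train.txt looks something like this (filenames illustrative):

```
data/shiny/00001.jpg
data/shiny/00002.jpg
data/shiny/00004.jpg
```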

Move the folder containing all of these under darknet/data/ (so that it becomes darknet/data/shiny).

File preparation

Prepare the files necessary for learning.

Preparing the weights file

An official pre-trained weights file is available, so use it. It can be downloaded with:

wget https://pjreddie.com/media/files/darknet53.conv.74

Preparation of configuration files

Create the following three files:

  • .data
  • .names
  • .cfg

Creating a .data file

Create a .data file.

classes= 24
train  = data/shiny/train.txt  
valid  = data/shiny/test.txt  
names = cfg/obj_shiny.names
backup = backup/  

Name it obj_shiny.data and save it under cfg/. At the same time, create a backup folder directly under darknet.

Creating a .names file

Create an obj_shiny.names file so that it sits at cfg/obj_shiny.names. Its contents are identical to classes.txt, so you can simply save a copy of classes.txt under the new name.

Creating a .cfg file

Copy cfg/yolov3-voc.cfg and make the following changes in each of the three [yolo] sections:

  • Change the number of classes, classes, to 24.
  • Change filters in the [convolutional] layer just before classes to 87, which is the value of (classes + 5) * 3.

When the changes are complete, save the file as cfg/yolov3_shiny.cfg.

Change the classes and filters in this part.

[convolutional]
size=1
stride=1
pad=1
filters=87
activation=linear

[yolo]
mask = 6,7,8
anchors = 10,13,  16,30,  33,23,  30,61,  62,45,  59,119,  116,90,  156,198,  373,326
classes=24
num=9
jitter=.3
ignore_thresh = .5
truth_thresh = 1
random=1
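The value 87 comes from each [yolo] layer predicting 3 boxes per grid cell, where each box carries 4 coordinates, 1 objectness score, and one score per class. A quick sanity check (the helper name is just for illustration):

```python
def yolo_filters(num_classes, boxes_per_cell=3):
    # each box predicts 4 coords + 1 objectness + num_classes class scores
    return (num_classes + 5) * boxes_per_cell

print(yolo_filters(24))  # 24 Shanimas idols -> 87
```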

Also, if you want to train in earnest, change these lines in the first [net] section:

  • batch = 1 → batch = 64
  • subdivisions = 1 → subdivisions = 8

Depending on your GPU, a memory error may occur; in that case reduce batch or increase subdivisions.
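After the change, the top of the first [net] section would look something like this (the surrounding lines stay as they are in yolov3-voc.cfg):

```
[net]
batch=64
subdivisions=8
```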

Training

darknet.exe detector train cfg/obj_shiny.data cfg/yolov3_shiny.cfg darknet53.conv.74

Training starts when this command is executed. If the loss graph does not appear, an error has most likely occurred, so check the output.

Experimental result

When training is finished, you can check the result with:

darknet.exe detector test cfg/obj_shiny.data cfg/yolov3_shiny.cfg backup/yolov3-obj_last.weights

darknet will then prompt you for the path of an image to test.

(結果.png) Of course, the card illustrations used for training are identified.

(結果2.png) Only about ten pictures of Asakura Toru could be prepared, but she was still recognized in standing pictures.

(結果3.png) A new Serizawa Asahi illustration that was not in the training data was recognized and correctly distinguished.

Profile (side-on) faces were not recognized, but it is fair to say the model learned.

At the end

(結果3.png) This is the graph of the training process; it was clearly overfitting by 50,000 iterations. Looking at the results, the weights from around 5,000 to 7,000 iterations were the most convincing.

In addition, every Shanimas card illustration was used as training data this time, so it is only natural that the training images themselves are recognized. To confirm that the model truly generalizes, more held-out test images should have been prepared.

Reference article

How to train YOLOv2 to detect custom objects

Learning of YOLO original data

Is your order YOLO! ?? (Until you learn and run YOLO on Windows 10)

Is your order YOLO v3! ?? -Running YOLO v3 on Windows-

Reference for the icrawler part: Easy collection of image data using the Python library icrawler