Two posts ago I covered environment setup, and last time I covered training. This time, I would like to build a model that can actually be used.
Last time, I also wrote about the goal of this project: to "make a Nogizaka-chan classifier". I haven't fully achieved it yet, but I now know how to build one, so for the time being I'd call it about half done. After thinking about what to do next, I decided to also develop a tool that can track the person recognized by YOLO. I wrote similar code for a class a long time ago, so I'm planning to improve on it; see the git repo for details. That is the current goal of this project.
The previous article described how to build a model, so I will build one by referring to it. To be honest, this article isn't really meant to show you anything; please treat it as a record for myself rather than a guide.
(1) Collect images of Yoda-chan. Last time, I collected 10 images of Yoda-chan and processed them to build a model. I suspected that wouldn't be enough, and sure enough it wasn't. So we need material to build a proper model, and I will start with how to collect it.
search.py
from icrawler.builtin import BingImageCrawler

# Save downloaded images into the "asuka" directory
crawler = BingImageCrawler(storage={"root_dir": "asuka"})
# Download up to 100 images matching the search keyword
crawler.crawl(keyword="The name you want to search", max_num=100)
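Since data.yaml below defines two classes, the crawler has to be run once per member, with each result saved to its own folder. Here is a minimal sketch of how that could look; the loop and the placeholder keywords are my own addition, not the original script.

from icrawler.builtin import BingImageCrawler

# Hypothetical extension: crawl images for each class into its own folder.
# Replace the keywords with the actual member names you want to search for.
for folder, keyword in [("asuka", "member name 1"), ("yoda", "member name 2")]:
    crawler = BingImageCrawler(storage={"root_dir": folder})
    crawler.crawl(keyword=keyword, max_num=100)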
I then annotated the collected images for training, which was harder than expected. For now, here is the dataset definition file.
data.yaml
train: test1/train/images
val: test1/valid/images
nc: 2
names: ['asuka', 'yoda']
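For reference, YOLO-format annotation uses one text file per image in a labels directory, with one line per bounding box: the class index followed by the normalized center x, center y, width, and height of the box. The file name and numbers below are only an illustration, not values from my dataset.

test1/train/labels/image001.txt
0 0.512 0.430 0.210 0.380
1 0.300 0.550 0.180 0.320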
I trained on about 100 images across the 2 classes for 300 epochs. It took a lot of time, so honestly it's not practical. Still, now that the model exists, I ran inference on an image that was not in the training data to check whether it is recognized, and it works. I wonder if this can also be done with video; I'll try that a little later.
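The post doesn't show the actual training and inference commands, so here is a minimal sketch of how the equivalent steps could be run from Python using the ultralytics package. The pretrained weights, image size, and test image path are assumptions on my part, and the original may well use a different YOLO version or its command-line interface instead.

from ultralytics import YOLO

# Start from pretrained weights (yolov8n.pt is an assumption, not necessarily what was used here)
model = YOLO("yolov8n.pt")
# Train on the 2-class dataset defined in data.yaml for 300 epochs
model.train(data="data.yaml", epochs=300, imgsz=640)
# Run inference on an image that was not in the training data
results = model.predict("test_image.jpg", save=True)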
Training currently takes a tremendous amount of time on my local PC (CPU). At this rate, even 130 images take hours. With that in mind, running the training in the cloud (GPU) seems far more efficient (well, obviously).
Postscript: With 2 classes, inference on those 2 classes works, but when I put in images of other people, it just doesn't work. I'm looking for a way to deal with this.
Consideration
・Improving computation speed: to evaluate many models, we first have to be able to build many models, so we need to consider how to train a model quickly.
・Consider building the model on a local machine with a GPU.
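As a first step toward the local GPU idea, it's worth checking whether PyTorch can actually see a CUDA device on the machine. This quick check is my own addition, not part of the original post.

import torch

# Check whether a CUDA-capable GPU is visible to PyTorch
print(torch.cuda.is_available())
if torch.cuda.is_available():
    # Print the name of the first available GPU
    print(torch.cuda.get_device_name(0))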
Next time, I will write specifically about how to use this. Postscript: I'd like to do something about the issue mentioned in the postscript above, so please let me know if you have any advice.