I made an application that automatically classifies images of my favorite character into separate folders.
This time I used PyTorch, a machine learning library, and tweepy, a library that wraps the Twitter API so you can fetch information from Twitter.
--It is hard to keep up with the entire Twitter timeline, so I wanted to save only images of my favorite character.
--I wanted to make something that combines tweepy and machine learning.
--The PyTorch transfer learning tutorial was excellent, so I wanted to put it to use.
--I wanted to experience the problems that come up when actually building a machine learning app.
--PyTorch transfer learning
--Changing the algorithm and tuning hyperparameters with Optuna
--Changes around the data
--Visualizing where the model looks with Grad-CAM
--Implementation in the application
PyTorch comes with pre-trained models that are easy to use. Following the transfer learning method in the PyTorch tutorial, I adapted one of them to this task. The assumption is that the Twitter timeline contains both real photographs and 2D images, and that the 2D images split into images of the target character and images that are not. So I first built a model that separates 2D images from real photographs, and then a second model that classifies the 2D images.
Specifically, the first method was to replace the fully connected layer of a pre-trained ResNet-18 with a new fully connected layer sized for the number of classes, feed the model images from the task, and retrain it.
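A minimal sketch of that setup, assuming the same structure as the PyTorch transfer learning tutorial (the data loaders and the train_model training loop are omitted, and the SGD settings shown are the tutorial defaults, not necessarily the exact values I used):

import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
from torchvision import models

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Load a pre-trained ResNet-18 and replace its head with a 2-class layer
model_ft = models.resnet18(pretrained=True)
num_ftrs = model_ft.fc.in_features
model_ft.fc = nn.Linear(num_ftrs, 2)
model_ft = model_ft.to(device)

# Loss, optimizer and scheduler as in the tutorial; train_model() would then fine-tune on the task images
criterion = nn.CrossEntropyLoss()
optimizer_ft = optim.SGD(model_ft.parameters(), lr=0.001, momentum=0.9)
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)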
From the beginning, the accuracy of separating real photographs from non-photographic images was very high, over roughly 95%, but the accuracy of identifying the specific character was only about 70%, so I tried to improve it in various ways.
The first thing I tried was changing the model. PyTorch ships with several pre-trained models, so I tried AlexNet among them. However, simply swapping the model did not improve accuracy, so I went back to ResNet.
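For reference, swapping to AlexNet means replacing a different layer, since torchvision's AlexNet keeps its head in model.classifier; a short sketch under that assumption:

from torchvision import models
import torch.nn as nn

# torchvision's AlexNet stores its classification head in model.classifier,
# so the last layer (index 6) is the one to swap for a 2-class task
model_alex = models.alexnet(pretrained=True)
num_ftrs = model_alex.classifier[6].in_features  # 4096
model_alex.classifier[6] = nn.Linear(num_ftrs, 2)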
At first I used SGD as the optimization algorithm, but I switched to Adam. However, the accuracy was not good at all, and since I had heard that Adam's hyperparameters matter a lot, I decided to use Optuna, a hyperparameter search library.
optuna
Optuna is a library that samples values from distributions for the hyperparameters you want to optimize. The concrete procedure is to first define an objective function that contains the value you want to optimize and the hyperparameters.
objective function
def objective(trial):
    # Sample Adam's hyperparameters and the scheduler decay rate for this trial
    lr = trial.suggest_loguniform('lr', 1e-6, 1e-4)
    beta1 = trial.suggest_uniform('beta1', 0.8, 0.95)
    beta2 = trial.suggest_uniform('beta2', 0.9, 0.99)
    eps = trial.suggest_loguniform('eps', 1e-9, 1e-7)
    gamma = trial.suggest_loguniform('gamma', 0.05, 0.2)

    # Same transfer learning setup as before: ResNet-18 with a 2-class head
    model_ft = models.resnet18(pretrained=True)
    num_ftrs = model_ft.fc.in_features
    model_ft.fc = nn.Linear(num_ftrs, 2)
    model_ft = model_ft.to(device)

    criterion = nn.CrossEntropyLoss()
    optimizer_ft = optim.Adam(model_ft.parameters(), lr=lr, betas=(beta1, beta2),
                              eps=eps, weight_decay=0, amsgrad=False)
    exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=gamma)

    model_ft, best_loss, best_acc = train_model(model_ft, criterion, optimizer_ft, exp_lr_scheduler)
    # Optuna minimizes the returned value, so return 1 - accuracy
    return 1 - best_acc
The caveat here is that the value you return is treated as a minimization problem. Since we want to maximize accuracy this time, we subtract the accuracy from 1 and return that. Optuna seems to have various other features, but this time I kept things simple.
study
study = optuna.create_study()
study.optimize(objective, n_trials=2)
After defining the objective function, create a study object and pass it the objective function and the number of trials; it then tunes the hyperparameters automatically.
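As a side note (my own addition, not something the original run used), the best result can be read back from the study object, and Optuna's create_study also accepts a direction argument so the objective can return the accuracy directly instead of 1 - accuracy:

# The study keeps the best trial found so far
print(study.best_params)  # e.g. {'lr': ..., 'beta1': ..., 'beta2': ..., 'eps': ..., 'gamma': ...}
print(study.best_value)   # the smallest (1 - accuracy) observed

# Alternatively, define the problem as maximization and return the accuracy as-is
study_max = optuna.create_study(direction='maximize')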
When I tuned Adam's hyperparameters this way, accuracy improved nicely: to about 77% on ResNet and about 75% on AlexNet.
Next, I took approaches on the data side, such as increasing the amount of data and raising the input resolution.
First, when I doubled the amount of data, accuracy improved to about 83%. After that, when I raised the resolution, accuracy reached about 90%.
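As a rough sketch of what raising the resolution means in practice (the 512/448 sizes are taken from the preprocessing used in the application code below, and are an assumption about the exact values used in this experiment), the torchvision pipeline simply resizes and crops to a larger input than the usual 224:

from torchvision import transforms

# Preprocess at a higher input resolution than the usual 224x224;
# the 512/448 sizes match the pipeline in the application code further down
high_res_transform = transforms.Compose([
    transforms.Resize(512),
    transforms.CenterCrop(448),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5]),
])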
I won't go into detail here, but in a nutshell Grad-CAM is a technique that visualizes where a CNN looks when it makes a decision, taking into account the contribution to each class.
https://github.com/kazuto1011/grad-cam-pytorch Grad-CAM and related methods are implemented here so that they can be used with PyTorch, and I implemented mine with reference to its demo.
[Images: Grad-CAM visualization when the model judges False / when it judges True]
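The repository above provides its own wrapper classes; as an illustration only (this is my own minimal sketch, not the code from that repository), Grad-CAM can be approximated with forward/backward hooks on ResNet-18's last convolutional block:

import torch
import torch.nn.functional as F
from torchvision import models

# Minimal Grad-CAM sketch. In practice the model would be the fine-tuned
# 2-class ResNet-18; a plain pre-trained one is used here to stay self-contained.
model = models.resnet18(pretrained=True)
model.eval()

features, gradients = [], []
model.layer4.register_forward_hook(lambda m, i, o: features.append(o))
model.layer4.register_backward_hook(lambda m, gi, go: gradients.append(go[0]))

def grad_cam(img_tensor, target_class):
    # img_tensor: preprocessed image of shape (1, 3, H, W)
    features.clear()
    gradients.clear()
    out = model(img_tensor)
    model.zero_grad()
    out[0, target_class].backward()
    acts = features[0]                                 # activations of layer4: (1, C, h, w)
    grads = gradients[0]                               # gradients w.r.t. those activations
    weights = grads.mean(dim=(2, 3), keepdim=True)     # global-average-pool the gradients
    cam = F.relu((weights * acts).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=img_tensor.shape[2:], mode='bilinear', align_corners=False)
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # normalize to [0, 1]
    return cam[0, 0]                                   # heat map the size of the input image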
tweepy
tweepy is a library that wraps the API provided by Twitter, so you can fetch the timeline, follow users, and so on. There are a couple of caveats:
--You can only request data a fixed number of times per hour (rate limits).
--You need to issue an authentication key at https://developer.twitter.com/.
The flow of this process is to first fetch the tweets on the timeline and then determine which of them have an image.
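As a small aside (an assumption of mine, not something the code below does), tweepy 3.x can also be told to wait automatically when the rate limit is reached:

import tweepy
import keys  # module that holds the four authentication keys, as in the code below

auth = tweepy.OAuthHandler(keys.consumer_key, keys.consumer_secret)
auth.set_access_token(keys.access_token, keys.access_token_secret)

# wait_on_rate_limit makes tweepy sleep until the rate-limit window resets
# instead of raising an error
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)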
Processing to determine if a URL has an image
import tweepy
import keys              # module that holds the Twitter authentication keys
import classifierImage   # module with the downloadImage() function shown below

def main():
    # Authenticate against the Twitter API
    auth = tweepy.OAuthHandler(keys.consumer_key, keys.consumer_secret)
    auth.set_access_token(keys.access_token, keys.access_token_secret)
    api = tweepy.API(auth)

    # Fetch up to 200 tweets from the home timeline and collect their image URLs
    public_tweets = api.home_timeline(count=200)
    urls = []
    for tweet in public_tweets:
        if 'media' in tweet.entities:
            for media in tweet.entities['media']:
                urls.append(media['media_url'])

    # Hand the URLs to the classifier/downloader
    classifierImage.downloadImage(urls)

if __name__ == '__main__':
    main()
Next, load the saved trained models. Fetch the image data from the URLs passed as arguments and convert the binary data into PIL Image objects. Preprocess them with torchvision, run them through the models, and decide where to download each image to.
A program that judges and saves images
import urllib.request
import io

import torch
import torch.nn as nn
from PIL import Image
from torchvision import transforms, models


def downloadImage(imageUrls):
    # Download the raw image bytes for each URL, keeping the file name
    tgts = []
    for url in imageUrls:
        filename = url.split('/')[-1]
        tgt = urllib.request.urlopen(url).read()
        tgts.append((tgt, filename))

    # Preprocessing: same resolution as used for training
    co = transforms.Compose([
        transforms.Resize(512),
        transforms.CenterCrop(448),
        transforms.ToTensor(),
        transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
    ])

    # Model 1: 2D (anime) image vs. real photograph
    classfier_anime_model = models.resnet18(pretrained=True)
    num_ftrs = classfier_anime_model.fc.in_features
    classfier_anime_model.fc = nn.Linear(num_ftrs, 2)
    classfier_anime_model.load_state_dict(torch.load('classfierAnime'))
    classfier_anime_model.eval()

    # Model 2: target character vs. other 2D images
    classfier_Koishi_model = models.resnet18(pretrained=True)
    num_ftrs_k = classfier_Koishi_model.fc.in_features
    classfier_Koishi_model.fc = nn.Linear(num_ftrs_k, 2)
    classfier_Koishi_model.load_state_dict(torch.load('classfierKoishi1'))
    classfier_Koishi_model.eval()

    with torch.no_grad():
        for i, tg in enumerate(tgts):
            # Bytes -> PIL Image -> preprocessed tensor with a batch dimension
            t = Image.open(io.BytesIO(tg[0])).convert('RGB')
            t = co(t)
            t = t.unsqueeze(0)

            # First decide whether the image is 2D or a real photograph
            out = classfier_anime_model(t)
            _, preds = torch.max(out, 1)
            if preds[0] == 1:
                # 2D image: check whether it is the target character
                out_k = classfier_Koishi_model(t)
                _, preds_k = torch.max(out_k, 1)
                if preds_k[0] == 0:
                    # Other 2D images
                    with open("img/" + tg[1], mode='wb') as f:
                        f.write(tg[0])
                else:
                    # Images of the target character
                    with open("imgKoishi/" + tg[1], mode='wb') as f:
                        f.write(tg[0])
            else:
                # Real photographs
                with open("realImg/" + tg[1], mode='wb') as f:
                    f.write(tg[0])
The accuracy of deciding whether an image was a real photograph was very good and satisfactory. For the specific character there was almost no oversight, but many of the images judged to be True did not actually meet the condition (recall is high, but precision is not that good). A probable cause is that the saved images were biased, so the model failed to reject images that do not meet the condition and were never saved. Another problem was that at first no image of the target character appeared on the timeline, so I could not tell whether it was being identified correctly. Through this actual operation, it was a good experience to see how data that was never learned gets in the way of identification.
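To make the recall/precision point concrete, here is a small sketch (my own illustration with made-up labels, not data from the app) of how the two metrics could be computed with scikit-learn:

from sklearn.metrics import precision_score, recall_score

# Hypothetical labels: 1 = "target character", 0 = "not the character"
y_true = [1, 1, 0, 0, 0, 1, 0, 0]   # manually checked ground truth
y_pred = [1, 1, 1, 1, 0, 1, 1, 0]   # what the classifier decided

# High recall: every actual positive was found.
# Low precision: half of the predicted positives are wrong.
print(recall_score(y_true, y_pred))     # 1.0
print(precision_score(y_true, y_pred))  # 0.5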
--I want to run this regularly, but I cannot leave my home computer on, so I want to run the processing on a Raspberry Pi.
--I want to update the model automatically using the saved images.
--I did not properly record changes such as the model versions, so I want to keep that data properly so I can include it in articles.
--Training is unstable, so I want to analyze it theoretically.
https://www.slideshare.net/takahirokubo7792/ss-71453093
https://qiita.com/koki-sato/items/c16c8e3445287698b3a8
http://docs.tweepy.org/en/v3.5.0/index.html
https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html
https://qiita.com/enmaru/items/2770df602dd7778d4ce6