I made an application that automatically classifies images of my favorite character into separate folders.
This time I used PyTorch, a machine learning library, and tweepy, a library that wraps the Twitter API so you can fetch information from Twitter.
--It is hard to keep up with the entire Twitter timeline, so I wanted to save only images of my favorite character.
--I wanted to make something that combines tweepy and machine learning.
--The PyTorch transfer learning tutorial was excellent, so I wanted to put it to use.
--I wanted to experience the problems that come up when actually building a machine learning app.
--PyTorch transfer learning
--Changing the algorithm and tuning hyperparameters with Optuna
--Changes around the data
--Visualizing where the model looks with Grad-CAM
--Implementation in the application
PyTorch comes with pre-trained models that are easy to use. Following the transfer learning method in the PyTorch tutorial, I adapted one of them to this task. The assumption is that the Twitter timeline contains both real photographs and 2D images, and that the 2D images split into images of the target character and images that are not. So I first built a model that separates 2D images from real photographs, and then a second model that classifies the 2D images.
Specifically, the first method was to replace the fully connected layer of a pre-trained ResNet-18 with a new fully connected layer sized for the number of classes, feed the model images from the task, and retrain it.
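A minimal sketch of that setup, assuming the same structure as the PyTorch transfer learning tutorial (the data loaders and the train_model training loop are omitted, and the SGD settings shown are the tutorial defaults, not necessarily the exact values I used):

import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
from torchvision import models

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Load a pre-trained ResNet-18 and replace its head with a 2-class layer
model_ft = models.resnet18(pretrained=True)
num_ftrs = model_ft.fc.in_features
model_ft.fc = nn.Linear(num_ftrs, 2)
model_ft = model_ft.to(device)

# Loss, optimizer and scheduler as in the tutorial; train_model() would then fine-tune on the task images
criterion = nn.CrossEntropyLoss()
optimizer_ft = optim.SGD(model_ft.parameters(), lr=0.001, momentum=0.9)
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)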
From the beginning, the accuracy of separating real photographs from non-photographic images was very high, over roughly 95%, but the accuracy of identifying the specific character was only about 70%, so I tried to improve it in various ways.
The first thing I tried was changing the model. PyTorch ships with several pre-trained models, so I tried AlexNet among them. However, simply swapping the model did not improve accuracy, so I went back to ResNet.
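For reference, swapping to AlexNet means replacing a different layer, since torchvision's AlexNet keeps its head in model.classifier; a short sketch under that assumption:

from torchvision import models
import torch.nn as nn

# torchvision's AlexNet stores its classification head in model.classifier,
# so the last layer (index 6) is the one to swap for a 2-class task
model_alex = models.alexnet(pretrained=True)
num_ftrs = model_alex.classifier[6].in_features  # 4096
model_alex.classifier[6] = nn.Linear(num_ftrs, 2)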
At first I used SGD as the optimization algorithm, but I switched to Adam. However, the accuracy was not good at all, and since I had heard that Adam's hyperparameters matter a lot, I decided to use Optuna, a hyperparameter search library.
optuna
Optuna is a library that samples values from distributions for the hyperparameters you want to optimize. The concrete procedure is to first define an objective function that contains the value you want to optimize and the hyperparameters.
objective function
def objective(trial):
    # Sample Adam's hyperparameters and the scheduler decay rate for this trial
    lr = trial.suggest_loguniform('lr', 1e-6, 1e-4)
    beta1 = trial.suggest_uniform('beta1', 0.8, 0.95)
    beta2 = trial.suggest_uniform('beta2', 0.9, 0.99)
    eps = trial.suggest_loguniform('eps', 1e-9, 1e-7)
    gamma = trial.suggest_loguniform('gamma', 0.05, 0.2)

    # Same transfer learning setup as before: ResNet-18 with a 2-class head
    model_ft = models.resnet18(pretrained=True)
    num_ftrs = model_ft.fc.in_features
    model_ft.fc = nn.Linear(num_ftrs, 2)
    model_ft = model_ft.to(device)

    criterion = nn.CrossEntropyLoss()
    optimizer_ft = optim.Adam(model_ft.parameters(), lr=lr, betas=(beta1, beta2),
                              eps=eps, weight_decay=0, amsgrad=False)
    exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=gamma)

    model_ft, best_loss, best_acc = train_model(model_ft, criterion, optimizer_ft, exp_lr_scheduler)
    # Optuna minimizes the returned value, so return 1 - accuracy
    return 1 - best_acc
The caveat here is that the value you return is treated as a minimization problem. Since we want to maximize accuracy this time, we subtract the accuracy from 1 and return that. Optuna seems to have various other features, but this time I kept things simple.
study
study = optuna.create_study()
study.optimize(objective, n_trials=2)
After defining the objective function, create a study object and pass it the objective function and the number of trials; it then tunes the hyperparameters automatically.
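As a side note (my own addition, not something the original run used), the best result can be read back from the study object, and Optuna's create_study also accepts a direction argument so the objective can return the accuracy directly instead of 1 - accuracy:

# The study keeps the best trial found so far
print(study.best_params)  # e.g. {'lr': ..., 'beta1': ..., 'beta2': ..., 'eps': ..., 'gamma': ...}
print(study.best_value)   # the smallest (1 - accuracy) observed

# Alternatively, define the problem as maximization and return the accuracy as-is
study_max = optuna.create_study(direction='maximize')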
When I tuned Adam's hyperparameters this way, accuracy improved nicely: to about 77% on ResNet and about 75% on AlexNet.
Next, I took approaches on the data side, such as increasing the amount of data and raising the input resolution.
First, when I doubled the amount of data, accuracy improved to about 83%. After that, when I raised the resolution, accuracy reached about 90%.
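As a rough sketch of what raising the resolution means in practice (the 512/448 sizes are taken from the preprocessing used in the application code below, and are an assumption about the exact values used in this experiment), the torchvision pipeline simply resizes and crops to a larger input than the usual 224:

from torchvision import transforms

# Preprocess at a higher input resolution than the usual 224x224;
# the 512/448 sizes match the pipeline in the application code further down
high_res_transform = transforms.Compose([
    transforms.Resize(512),
    transforms.CenterCrop(448),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5]),
])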
I won't go into detail here, but in a nutshell Grad-CAM is a technique that visualizes where a CNN looks when it makes a decision, taking into account the contribution to each class.
https://github.com/kazuto1011/grad-cam-pytorch Grad-CAM and related methods are implemented here so that they can be used with PyTorch, and I implemented mine with reference to its demo.
[Images: Grad-CAM visualization when the model judges False / when it judges True]
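The repository above provides its own wrapper classes; as an illustration only (this is my own minimal sketch, not the code from that repository), Grad-CAM can be approximated with forward/backward hooks on ResNet-18's last convolutional block:

import torch
import torch.nn.functional as F
from torchvision import models

# Minimal Grad-CAM sketch. In practice the model would be the fine-tuned
# 2-class ResNet-18; a plain pre-trained one is used here to stay self-contained.
model = models.resnet18(pretrained=True)
model.eval()

features, gradients = [], []
model.layer4.register_forward_hook(lambda m, i, o: features.append(o))
model.layer4.register_backward_hook(lambda m, gi, go: gradients.append(go[0]))

def grad_cam(img_tensor, target_class):
    # img_tensor: preprocessed image of shape (1, 3, H, W)
    features.clear()
    gradients.clear()
    out = model(img_tensor)
    model.zero_grad()
    out[0, target_class].backward()
    acts = features[0]                                 # activations of layer4: (1, C, h, w)
    grads = gradients[0]                               # gradients w.r.t. those activations
    weights = grads.mean(dim=(2, 3), keepdim=True)     # global-average-pool the gradients
    cam = F.relu((weights * acts).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=img_tensor.shape[2:], mode='bilinear', align_corners=False)
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # normalize to [0, 1]
    return cam[0, 0]                                   # heat map the size of the input image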
tweepy
tweepy is a library that wraps the API provided by Twitter, so you can fetch the timeline, follow users, and so on. There are a couple of caveats:
--You can only request data a fixed number of times per hour (rate limits).
--You need to issue an authentication key at https://developer.twitter.com/.
The flow of this process is to first fetch the tweets on the timeline and then determine which of them have an image.
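As a small aside (an assumption of mine, not something the code below does), tweepy 3.x can also be told to wait automatically when the rate limit is reached:

import tweepy
import keys  # module that holds the four authentication keys, as in the code below

auth = tweepy.OAuthHandler(keys.consumer_key, keys.consumer_secret)
auth.set_access_token(keys.access_token, keys.access_token_secret)

# wait_on_rate_limit makes tweepy sleep until the rate-limit window resets
# instead of raising an error
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)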
Processing to determine if a URL has an image
import tweepy
import keys              # module that holds the Twitter authentication keys
import classifierImage   # module with the downloadImage() function shown below

def main():
    # Authenticate against the Twitter API
    auth = tweepy.OAuthHandler(keys.consumer_key, keys.consumer_secret)
    auth.set_access_token(keys.access_token, keys.access_token_secret)
    api = tweepy.API(auth)

    # Fetch up to 200 tweets from the home timeline and collect their image URLs
    public_tweets = api.home_timeline(count=200)
    urls = []
    for tweet in public_tweets:
        if 'media' in tweet.entities:
            for media in tweet.entities['media']:
                urls.append(media['media_url'])

    # Hand the URLs to the classifier/downloader
    classifierImage.downloadImage(urls)

if __name__ == '__main__':
    main()
Next, load the saved trained models. Fetch the image data from the URLs passed as arguments and convert the binary data into PIL Image objects. Preprocess them with torchvision, run them through the models, and decide where to download each image to.
A program that judges and saves images
import urllib.request
import io

import torch
import torch.nn as nn
from PIL import Image
from torchvision import transforms, models


def downloadImage(imageUrls):
    # Download the raw image bytes for each URL, keeping the file name
    tgts = []
    for url in imageUrls:
        filename = url.split('/')[-1]
        tgt = urllib.request.urlopen(url).read()
        tgts.append((tgt, filename))

    # Preprocessing: same resolution as used for training
    co = transforms.Compose([
        transforms.Resize(512),
        transforms.CenterCrop(448),
        transforms.ToTensor(),
        transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
    ])

    # Model 1: 2D (anime) image vs. real photograph
    classfier_anime_model = models.resnet18(pretrained=True)
    num_ftrs = classfier_anime_model.fc.in_features
    classfier_anime_model.fc = nn.Linear(num_ftrs, 2)
    classfier_anime_model.load_state_dict(torch.load('classfierAnime'))
    classfier_anime_model.eval()

    # Model 2: target character vs. other 2D images
    classfier_Koishi_model = models.resnet18(pretrained=True)
    num_ftrs_k = classfier_Koishi_model.fc.in_features
    classfier_Koishi_model.fc = nn.Linear(num_ftrs_k, 2)
    classfier_Koishi_model.load_state_dict(torch.load('classfierKoishi1'))
    classfier_Koishi_model.eval()

    with torch.no_grad():
        for i, tg in enumerate(tgts):
            # Bytes -> PIL Image -> preprocessed tensor with a batch dimension
            t = Image.open(io.BytesIO(tg[0])).convert('RGB')
            t = co(t)
            t = t.unsqueeze(0)

            # First decide whether the image is 2D or a real photograph
            out = classfier_anime_model(t)
            _, preds = torch.max(out, 1)
            if preds[0] == 1:
                # 2D image: check whether it is the target character
                out_k = classfier_Koishi_model(t)
                _, preds_k = torch.max(out_k, 1)
                if preds_k[0] == 0:
                    # Other 2D images
                    with open("img/" + tg[1], mode='wb') as f:
                        f.write(tg[0])
                else:
                    # Images of the target character
                    with open("imgKoishi/" + tg[1], mode='wb') as f:
                        f.write(tg[0])
            else:
                # Real photographs
                with open("realImg/" + tg[1], mode='wb') as f:
                    f.write(tg[0])
The accuracy of deciding whether an image was a real photograph was very good and satisfactory. For the specific character there was almost no oversight, but many of the images judged to be True did not actually meet the condition (recall is high, but precision is not that good). A probable cause is that the saved images were biased, so the model failed to reject images that do not meet the condition and were never saved. Another problem was that at first no image of the target character appeared on the timeline, so I could not tell whether it was being identified correctly. Through this actual operation, it was a good experience to see how data that was never learned gets in the way of identification.
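To make the recall/precision point concrete, here is a small sketch (my own illustration with made-up labels, not data from the app) of how the two metrics could be computed with scikit-learn:

from sklearn.metrics import precision_score, recall_score

# Hypothetical labels: 1 = "target character", 0 = "not the character"
y_true = [1, 1, 0, 0, 0, 1, 0, 0]   # manually checked ground truth
y_pred = [1, 1, 1, 1, 0, 1, 1, 0]   # what the classifier decided

# High recall: every actual positive was found.
# Low precision: half of the predicted positives are wrong.
print(recall_score(y_true, y_pred))     # 1.0
print(precision_score(y_true, y_pred))  # 0.5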
--I want to run this regularly, but I cannot leave my home computer on, so I want to run the processing on a Raspberry Pi.
--I want to update the model automatically using the saved images.
--I did not properly record changes such as the model versions, so I want to keep that data properly so I can include it in articles.
--Training is unstable, so I want to analyze it theoretically.
https://www.slideshare.net/takahirokubo7792/ss-71453093
https://qiita.com/koki-sato/items/c16c8e3445287698b3a8
http://docs.tweepy.org/en/v3.5.0/index.html
https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html
https://qiita.com/enmaru/items/2770df602dd7778d4ce6