The result of building my first working program in Python (image recognition)

1 Introduction

Hello, I'm Magicchic, a trainee at Aidemy. What image do you have of programming? It has been about two months since I first touched programming, and I still find it difficult. Even so, it is satisfying to fix an error and get something to run. In this post I share what happened when a beginner like me tried to build something that works for the first time in Python.

2 What is image recognition?

Image recognition is now widely used, for example for face detection in cameras and for spotting defective products in factories. There are also apps that tell you which celebrity you resemble or that identify the type of animal in a photo. The technology behind this kind of advanced image recognition is the convolutional neural network (CNN). A CNN is one of the methods used to train AI for image analysis, and it can analyze images even when parts of them are hard to see.

A CNN is a feedforward network whose structure includes two kinds of layers, convolution layers and pooling layers, combined with "weight sharing": the same filter weights are reused at every position of the image.

In other words, it is a multilayer neural network that additionally incorporates these two specially designed kinds of hidden layers.

After the image to be analyzed is loaded into the input layer, the convolution layer scans the whole image with filters and extracts its features (gradients, unevenness, and so on). The extracted feature data is then sent to the pooling layer, where it is condensed into more compact feature data.
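To make the two layer types concrete, here is a minimal NumPy sketch of one convolution step followed by one max-pooling step. The toy image, the hand-made edge filter, and the function names are assumptions for illustration only. A real CNN learns its filter values during training, and Keras implements these operations far more efficiently.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            # The same kernel weights are reused at every position (weight sharing)
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    h = feature_map.shape[0] // size * size
    w = feature_map.shape[1] // size * size
    trimmed = feature_map[:h, :w]
    return trimmed.reshape(h // size, size, w // size, size).max(axis=(1, 3))

# Toy 6x6 grayscale image: dark left half, bright right half
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hypothetical hand-made filter that responds where brightness increases left to right
edge_kernel = np.array([[-1.0, 0.0, 1.0],
                        [-1.0, 0.0, 1.0],
                        [-1.0, 0.0, 1.0]])

features = conv2d_valid(image, edge_kernel)  # 4x4 feature map
pooled = max_pool(features, size=2)          # condensed 2x2 feature map

print(features)
print(pooled)
```

The feature map is large only in the columns where the filter straddles the dark-to-bright boundary, and pooling shrinks the map while keeping those strong responses.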

Keras is a library that makes CNNs easy for anyone to use. If you want to write an image recognition program, building the CNN in Keras is a shortcut.

What is Keras?

Keras is a high-level neural network library written in Python that can run on top of TensorFlow or Theano. Keras was developed with a focus on enabling rapid experimentation. Being able to go from idea to result as quickly as possible is important for good research. (From the official Keras documentation)

3 Installation and procedure

For this project I chose car model classification as the theme. I know nothing about cars, and I am always impressed by people who can name a model just by looking at it. I am not trying to compete with them, but I wanted to see whether machine learning could do the same. As targets I picked three of Toyota's domestic luxury models: the Century, the Crown, and the Mark X.


The general procedure is as follows.

  1. Collect the images
  2. Convert the images to arrays and save them as training data in npy format
  3. Augment the data
  4. Build the learning model and the evaluation function
  5. Results

4 Description

From here on I follow the steps listed above. First, I collected the images with icrawler.

pip install icrawler

Run this in your terminal to install icrawler first. Then save the following in a text editor and run it:

from icrawler.builtin import BingImageCrawler

# Search keyword (also used as the download folder) and maximum number of images
targets = {"toyotacentury": 100, "toyotacrown": 120, "toyotamarkx": 120}

for keyword, max_num in targets.items():
    crawler = BingImageCrawler(storage={"root_dir": keyword})
    crawler.crawl(keyword=keyword, max_num=max_num)

After deleting by eye the images that could not be used (irrelevant or unclear ones), 80 images remained for each model. Next, save them in npy format.

from PIL import Image
import glob
import numpy as np
from sklearn import model_selection

classes = ["toyotacentury", "toyotacrown", "toyotamarkx"]
num_classes = len(classes)
image_size = 100

# Load each image and convert it to a numpy array
X = []
Y = []
for index, classlabel in enumerate(classes):
    photos_dir = "./" + classlabel
    files = glob.glob(photos_dir + "/*.jpg")
    for i, file in enumerate(files):
        if i >= 93: break  # use at most 93 images per class
        image = Image.open(file)
        image = image.convert("RGB")
        image = image.resize((image_size, image_size))
        data = np.asarray(image)
        X.append(data)
        Y.append(index)  # the label is the index of the class
# Convert from list to numpy
X = np.array(X)
Y = np.array(Y)
# Split the data into training and evaluation sets (75% / 25% by default)
X_train, X_test, y_train, y_test = model_selection.train_test_split(X, Y)
# The four arrays have different shapes, so store them in an object array;
# recent NumPy versions refuse to build one implicitly from a ragged tuple
xy = np.empty(4, dtype=object)
for k, arr in enumerate((X_train, X_test, y_train, y_test)):
    xy[k] = arr
np.save("./toyotacar.npy", xy)
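Before saving, it can help to check that the arrays look the way you expect. The helper below is a hypothetical sketch (the name `summarize_split` and the dummy data are not part of the original script) that reports the array shape and the number of samples per class.

```python
import numpy as np

def summarize_split(X, y, class_names):
    """Return the image array shape and the number of samples per class."""
    counts = np.bincount(y, minlength=len(class_names))
    return {"shape": X.shape,
            "per_class": dict(zip(class_names, counts.tolist()))}

# Dummy stand-in data with the same layout as the arrays built above
classes = ["toyotacentury", "toyotacrown", "toyotamarkx"]
X_demo = np.zeros((6, 100, 100, 3), dtype=np.uint8)
y_demo = np.array([0, 0, 1, 1, 2, 2])

summary = summarize_split(X_demo, y_demo, classes)
print(summary)
```

With the real data you would pass `X_train` and `y_train` (or `X_test` and `y_test`) instead of the dummy arrays. A very uneven per-class count, or a shape other than (N, 100, 100, 3), would be a sign that some files were not loaded.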

Next comes data augmentation. The current number of images is probably not enough, so I generate additional images from the ones in each target folder.

import os
import glob
import numpy as np
from keras.preprocessing.image import ImageDataGenerator, load_img, img_to_array

# Generate augmented copies of one image and save them to output_dir
def draw_images(generator, x, output_dir, index):
    save_name = 'extended-' + str(index)
    g = generator.flow(x, batch_size=1, save_to_dir=output_dir,
                       save_prefix=save_name, save_format='jpeg')

    # Number of images to generate from one input image (10 this time)
    for i in range(10):
        batch = next(g)

# Define ImageDataGenerator
datagen = ImageDataGenerator(rotation_range=20,
                            width_shift_range=0,
                            shear_range=0,
                            height_shift_range=0,
                            zoom_range=0,
                            horizontal_flip=True,
                            fill_mode="nearest",
                            channel_shift_range=40)

# Run the same augmentation for each of the three models
for class_name in ["toyotacentury", "toyotacrown", "toyotamarkx"]:
    # Output destination folder, e.g. "toyotacenturyzou"
    output_dir = class_name + "zou"
    if not os.path.exists(output_dir):
        os.mkdir(output_dir)

    # Images to augment
    images = glob.glob(os.path.join(class_name, "*.jpg"))

    # Image augmentation
    for i in range(len(images)):
        img = load_img(images[i])
        img = img.resize((350, 300))
        x = img_to_array(img)
        x = np.expand_dims(x, axis=0)
        draw_images(datagen, x, output_dir, i)

This step increased the number of images to 800 per model, 10 times the original 80.
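To get a feel for what the horizontal flip and channel shift settings do, here is a rough NumPy-only sketch. It is not Keras's actual implementation (rotation is left out entirely, and the function names are made up for illustration). It only shows the idea of producing several slightly different copies of one image.

```python
import numpy as np

def flip_horizontal(img):
    """Mirror an H x W x C image left to right."""
    return img[:, ::-1, :]

def shift_channels(img, max_shift, rng):
    """Add one random brightness offset to the image and clip to 0..255."""
    offset = rng.uniform(-max_shift, max_shift)
    return np.clip(img.astype(np.float32) + offset, 0, 255).astype(np.uint8)

def augment(img, n_copies, rng, max_shift=40):
    """Create n_copies randomly flipped and shifted variants of one image."""
    copies = []
    for _ in range(n_copies):
        out = flip_horizontal(img) if rng.random() < 0.5 else img
        copies.append(shift_channels(out, max_shift, rng))
    return copies

rng = np.random.default_rng(0)
img = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)  # tiny dummy image
augmented = augment(img, n_copies=10, rng=rng)

# 10 variants per image, so 80 source images give 800 augmented images
print(len(augmented), 80 * len(augmented))
```

Each variant keeps the same shape and label as its source image, which is why augmentation can enlarge a dataset without any extra manual labeling.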


Finally, let's move on to building the learning model and the evaluation function.

One thing to note first: in the main function, calling np.load with only the file name in the parentheses can raise an error, because the saved file holds pickled Python objects and np.load does not load those by default. Passing the allow_pickle=True option fixes it. See the link below (https://qiita.com/ytkj/items/ee6e1125476883923db8)
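The behavior can be reproduced with a small self-contained example. The sketch below uses tiny dummy arrays and a temporary file (both assumptions for illustration) to show that loading an object array fails by default and succeeds with allow_pickle=True.

```python
import os
import tempfile
import numpy as np

# Four arrays of different shapes, standing in for X_train, X_test, y_train, y_test
parts = [np.zeros((3, 4, 4, 3)), np.zeros((1, 4, 4, 3)),
         np.array([0, 1, 2]), np.array([1])]

# Pack them into a 1-D object array so that np.save pickles each element
packed = np.empty(len(parts), dtype=object)
for i, part in enumerate(parts):
    packed[i] = part

path = os.path.join(tempfile.mkdtemp(), "demo.npy")
np.save(path, packed)

# Without allow_pickle=True, NumPy refuses to load object arrays
try:
    np.load(path)
    load_failed = False
except ValueError:
    load_failed = True

# With allow_pickle=True the four arrays come back and can be unpacked
X_train, X_test, y_train, y_test = np.load(path, allow_pickle=True)
print(load_failed, X_train.shape, y_test)
```

Pickle files can execute arbitrary code when loaded, so allow_pickle=True should only be used on files you created yourself.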

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras.utils import np_utils
import keras
import numpy as np
from keras.optimizers import RMSprop

classes = ["toyotacentury", "toyotacrown", "toyotamarkx"]
num_classes = len(classes)
image_size = 100

# Definition of main function

def main():
    # Read the data from the file into arrays
    X_train, X_test, y_train, y_test = np.load("./toyotacar.npy", allow_pickle=True)
    # Normalize the pixel values to the range 0 to 1
    X_train = X_train.astype("float") / 256
    X_test = X_test.astype("float") / 256
    # Convert the integer labels to one-hot vectors
    y_train = np_utils.to_categorical(y_train, num_classes)
    y_test = np_utils.to_categorical(y_test, num_classes)

    # Call the training and evaluation functions
    model = model_train(X_train, y_train)
    model_eval(model, X_test, y_test)
    
def model_train(X, y):
    model = Sequential()
    model.add(Conv2D(32,(3,3), padding='same',input_shape=X.shape[1:]))
    model.add(Activation('relu'))
    model.add(Conv2D(32,(3,3)))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2,2)))
    model.add(Dropout(0.25))

    model.add(Conv2D(64,(2,2), padding='same'))
    model.add(Activation('relu'))
    model.add(Conv2D(64,(3,3)))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(3,3)))
    model.add(Dropout(0.25))

    model.add(Flatten())
    model.add(Dense(512))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(3))
    model.add(Activation('softmax'))
    # Optimizer
    opt = keras.optimizers.RMSprop(lr=0.0001, decay=1e-6)
    # Train to reduce the error between the correct labels and the predictions
    model.compile(loss='categorical_crossentropy',optimizer=opt,metrics=['accuracy'])
    model.fit(X, y, batch_size=20, epochs=75)

    #Save model
    model.save('./toyota_cnn.h5')

    return model

def model_eval(model, X, y):
    scores = model.evaluate(X, y, verbose=1)
    print('Test Loss: ', scores[0])
    print('Test Accuracy: ', scores[1])

if __name__ == "__main__":
    main()
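It can be useful to trace how the image size changes through the layers of the model above. The sketch below does the arithmetic in plain Python, assuming the Keras defaults that the code relies on: stride 1 for Conv2D, no padding unless padding='same' is given, and a pooling stride equal to the pool size. The helper names are made up for illustration.

```python
def conv_out(size, kernel, padding):
    """Output width/height of a stride-1 convolution."""
    return size if padding == "same" else size - kernel + 1

def pool_out(size, pool):
    """Output width/height of non-overlapping max pooling (stride == pool size)."""
    return size // pool

size = 100                          # input images are 100 x 100 x 3
size = conv_out(size, 3, "same")    # Conv2D(32, (3,3), padding='same') -> 100
size = conv_out(size, 3, "valid")   # Conv2D(32, (3,3))                 -> 98
size = pool_out(size, 2)            # MaxPooling2D((2,2))               -> 49
size = conv_out(size, 2, "same")    # Conv2D(64, (2,2), padding='same') -> 49
size = conv_out(size, 3, "valid")   # Conv2D(64, (3,3))                 -> 47
size = pool_out(size, 3)            # MaxPooling2D((3,3))               -> 15

flattened = size * size * 64            # values fed into Dense(512)
dense_params = flattened * 512 + 512    # weights plus biases of Dense(512)

print(size, flattened, dense_params)
```

Under these assumptions the first dense layer alone has over seven million parameters, which is a lot to fit with a few hundred training images.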

You may get an AttributeError when you run it (maybe it's just me). I've attached the link below for a detailed explanation of the cause. (https://ja.stackoverflow.com/questions/48286/python%e3%81%a7attributeerror)

The execution result is as follows.

Test Loss:  2.74328875541687
Test Accuracy:  0.4833333194255829
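To put the accuracy figure in context, here is a small piece of arithmetic. It assumes that all 80 images per model were loaded and that train_test_split used its default 25% test share. Neither is stated explicitly above, so treat the numbers as an estimate.

```python
num_classes = 3
chance_accuracy = 1 / num_classes          # guessing at random among 3 models

total_images = 3 * 80                      # assumption: 80 usable images per model
test_images = total_images // 4            # assumption: default 25% test split
reported_accuracy = 0.4833333194255829     # value printed above

correct = round(reported_accuracy * test_images)

print(f"chance level: {chance_accuracy:.3f}")
print(f"test images: {test_images}, correctly classified: {correct}")
```

Under these assumptions the model got 29 of 60 test images right, which is better than the one-in-three chance level but still far from reliable.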

5 Consideration of results and future prospects

In this run the test accuracy is below 50%, which is not reliable, and the loss is far from 0. I would like to improve these numbers by cropping the images, increasing the number of images, and changing the number of epochs until I find better settings.

With "cars" as the theme in particular, the overall shapes differ very little between models, which makes them hard to tell apart. Once accuracy is as close to 1.0 and loss as close to 0 as possible, I would like to define a function that receives an image, classifies it, and outputs the prediction.

Link I plan to refer to when writing that code
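As a rough idea of what such a prediction function might look like, here is a minimal sketch. The function name `predict_class` and the `DummyModel` class are hypothetical. The dummy only stands in for a trained Keras model so that the example runs on its own, and the input is assumed to be an image already resized to 100 x 100 RGB.

```python
import numpy as np

classes = ["toyotacentury", "toyotacrown", "toyotamarkx"]

def predict_class(model, image_array, class_names):
    """Classify one H x W x 3 uint8 image and return (label, probability)."""
    x = image_array.astype("float32") / 256   # same scaling as in training
    x = np.expand_dims(x, axis=0)              # the model expects a batch
    probs = np.asarray(model.predict(x))[0]
    best = int(np.argmax(probs))
    return class_names[best], float(probs[best])

class DummyModel:
    """Hypothetical stand-in for the trained network; returns fixed probabilities."""
    def predict(self, batch):
        return np.tile([0.2, 0.7, 0.1], (len(batch), 1))

image = np.zeros((100, 100, 3), dtype=np.uint8)
label, confidence = predict_class(DummyModel(), image, classes)
print(label, confidence)
```

With a real model, the `DummyModel()` argument would be replaced by the network loaded from the saved toyota_cnn.h5 file, and the image would be read and resized the same way as in the conversion script.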

(https://qiita.com/kenichiro-yamato/items/b64c70882473904600bf)

References

(https://qiita.com/kazama0119/items/ede4732d21fe00085eb6)
(https://qiita.com/keimoriyama/items/846a3462a92c8c5661ff)
(https://qiita.com/keimoriyama/items/7b09d7c1797fcee6a2b0)
(https://udemy.benesse.co.jp/data-science/ai/convolution-neural-network.html)
(https://dev.classmethod.jp/articles/introduction-keras-deeplearning/)
