[PYTHON] Challenge image classification by TensorFlow2 + Keras 2 ~ Let's take a closer look at the input data ~

Introduction

Let's try **classifying handwritten digit images (MNIST)** with **TensorFlow2 + Keras** in the Google Colaboratory environment (and deepen our understanding of Python and deep learning along the way). Last time, I got as far as actually running the sample code from the official TensorFlow tutorial.

- Challenge image classification by TensorFlow2 + Keras series
  1. Move for the time being
  2. Take a closer look at the input data
  3. Visualize MNIST data
  4. Let's make a prediction with the trained model
  5. Observe images that fail to classify
  6. Try preprocessing and classifying images prepared by yourself
  7. Understanding layer types and activation functions
  8. Select optimization algorithm and loss function
  9. Try learning, saving and loading the model

The raw MNIST data is available at http://yann.lecun.com/exdb/mnist/, although it is not used directly here. According to "Illustrated Rapid Learning DEEP LEARNING" (by Tomoaki Masuda), **MNIST** has the following origins:

One of the NIST (National Institute of Standards and Technology) databases contained a dataset of digits handwritten by US Census Bureau staff and high school students. "M"NIST is a modified version of it that is easier to use for machine learning.

This time, I will explain the contents of the **training data** (x_train, y_train) and the **test data** (x_test, y_test) from the sample code shown last time by taking a closer look at them (visualizing them with matplotlib is left for next time).

First of all, I will review the "**multi-class classification problem**" and "**deep learning**" (to confirm where training data and test data fit in).

Multi-class classification problem

Handwritten digit recognition is a **multi-class classification problem**: the problem of predicting the **category (class)** to which the input data belongs. The categories, such as "dog", "cat", and "bird", are **given in advance** as part of the problem setting, and the task is to determine which of them a given input (for example, an image) belongs to.

[Figure: multi-class classification]
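
As a minimal sketch of this problem setting, assume a hypothetical model that outputs one score per category. The `scores` values and the `pick_category` helper below are made up for illustration and are not part of the original sample code. Under that assumption, the predicted category is simply the one with the highest score:

```python
import numpy as np

# Categories are fixed in advance by the problem setting.
CLASS_NAMES = ["dog", "cat", "bird"]

def pick_category(scores):
    """Return the name of the category with the highest score."""
    return CLASS_NAMES[int(np.argmax(scores))]

# Hypothetical scores for one input image (made-up numbers).
scores = np.array([0.1, 0.7, 0.2])
predicted = pick_category(scores)
print(predicted)  # -> cat
```

For MNIST the idea is the same, except that there are ten categories, "0" to "9".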

Various approaches have been proposed for multi-class classification problems, but here we will solve it using **deep learning**.

Deep learning

Deep learning, as used here, is a form of **supervised machine learning**. Supervised machine learning broadly consists of **two stages**: the "learning phase" and the "prediction phase" (also called the inference phase or application phase).

[Figure: phases of supervised machine learning]

First, in the **learning phase**, the model is given a large number of pairs of **input data** and **correct answer data** (also called teacher data, correct answer values, or correct answer labels) and learns the relationship between them. This set of pairs is called the **training data** (or learning data), and a model trained on it is called a "**trained model**".

[Figure: conceptual illustration of the learning phase]

In the subsequent **prediction phase**, **unknown input data** is given to the trained model, which **predicts the output**. For a multi-class classification problem, the predicted output is a category (for example, "dog").
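
The two phases can be sketched in a few lines. Note that this is not deep learning: to keep the example tiny, a nearest-centroid classifier stands in for the neural network, and the function names `fit` and `predict` as well as the toy data are assumptions chosen for illustration.

```python
import numpy as np

def fit(inputs, labels):
    """Learning phase: store the mean input vector (centroid) of each class."""
    classes = np.unique(labels)
    centroids = np.stack([inputs[labels == c].mean(axis=0) for c in classes])
    return classes, centroids

def predict(model, inputs):
    """Prediction phase: assign each input to the class of the closest centroid."""
    classes, centroids = model
    # Distance from every input to every centroid, shape (n_inputs, n_classes).
    distances = np.linalg.norm(inputs[:, None, :] - centroids[None, :, :], axis=2)
    return classes[np.argmin(distances, axis=1)]

# Toy training data: 2-D points with correct labels 0 or 1 (made-up values).
train_x = np.array([[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.1]])
train_y = np.array([0, 0, 1, 1])

trained_model = fit(train_x, train_y)

# Unknown inputs that the model has not seen during learning.
new_x = np.array([[0.1, 0.2], [0.8, 1.0]])
predictions = predict(trained_model, new_x)
print(predictions)  # -> [0 1]
```

The structure is what matters here: the pairs (train_x, train_y) are used only in the learning phase, and the prediction phase receives inputs without any correct answers.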

**Evaluation** is the process of measuring how well the trained model performs. For evaluation, we first prepare **input data and correct answer data that were not used for training**, and give **only the input data** to the trained model to obtain predictions. The predictions are then checked against the correct answer data and scored, and the score is used as the evaluation value. Besides the **accuracy** and the **loss function value** (loss) that appeared last time, other metrics such as precision and recall are used as needed.
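
As a rough sketch of the scoring step, the following computes accuracy, plus precision and recall for a single class, from made-up predictions. The helper names and the eight sample values are assumptions for illustration; in practice Keras and other libraries provide these metrics.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the correct answers."""
    return float(np.mean(y_true == y_pred))

def precision_recall(y_true, y_pred, target):
    """Precision and recall for the single class `target`."""
    true_positive = np.count_nonzero((y_pred == target) & (y_true == target))
    predicted_positive = np.count_nonzero(y_pred == target)
    actual_positive = np.count_nonzero(y_true == target)
    precision = true_positive / predicted_positive if predicted_positive else 0.0
    recall = true_positive / actual_positive if actual_positive else 0.0
    return precision, recall

# Made-up correct answers and predictions for eight test items.
y_true = np.array([5, 0, 4, 1, 9, 2, 1, 3])
y_pred = np.array([5, 0, 4, 7, 9, 2, 1, 1])

acc = accuracy(y_true, y_pred)
prec, rec = precision_recall(y_true, y_pred, target=1)
print(acc)        # -> 0.75
print(prec, rec)  # -> 0.5 0.5
```

Six of the eight predictions are correct, so the accuracy is 0.75. For the class "1", one of the two items predicted as "1" is right (precision 0.5) and one of the two real "1" items is found (recall 0.5).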

MNIST training data, test data

The following code downloads the MNIST data and stores it in the variables x_train, y_train, x_test, and y_test (for the whole program, see the [previous article](https://qiita.com/code0327/items/7d3c7bd3327ff049243a)).

python


import tensorflow as tf

mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

Here, the *_train variables hold the input and correct answer data assigned for training, and the *_test variables hold those assigned for testing (model evaluation). There are 60,000 items for training and 10,000 for testing.

In addition, the x_* variables hold the input data (handwritten digit images: 28x28 pixels in 256-level grayscale), and the y_* variables hold the correct answer data (a category from "0" to "9"), each stored as an array.
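
To make this layout concrete without downloading anything, here is a small sketch that uses random stand-in arrays. The names `fake_x` and `fake_y` and their contents are assumptions for illustration; the real x_train has the shape (60000, 28, 28).

```python
import numpy as np

# Stand-in data (an assumption for illustration): 3 random "images" with the
# same layout as the MNIST arrays, i.e. 28x28 pixels with values 0 to 255.
rng = np.random.default_rng(0)
fake_x = rng.integers(0, 256, size=(3, 28, 28), dtype=np.uint8)
fake_y = np.array([5, 0, 4], dtype=np.uint8)

print(fake_x.shape)     # -> (3, 28, 28): (number of images, rows, columns)
print(fake_x[0].shape)  # -> (28, 28): one image
print(fake_y.shape)     # -> (3,): one label per image
print(fake_x.dtype)     # -> uint8: 256 levels fit in one unsigned byte
```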

First, let's use len() to check that they really contain 60,000 and 10,000 items respectively.

python


# Training data
print(len(x_train))  # Execution result -> 60000
print(len(y_train))  # Execution result -> 60000
# Test data
print(len(x_test))   # Execution result -> 10000
print(len(y_test))   # Execution result -> 10000

Next, let's check the **type** of each variable.

python


print(type(x_train))  # Execution result -> <class 'numpy.ndarray'>
print(type(y_train))  # Execution result -> <class 'numpy.ndarray'>
print(type(x_test))   # Execution result -> <class 'numpy.ndarray'>
print(type(y_test))   # Execution result -> <class 'numpy.ndarray'>

Next, let's check the contents of y_train (= correct answer data for training).

python


print(y_train)  # Execution result -> [5 0 4 ... 5 6 8]

This shows that the correct answer for item 0 is "5", for item 1 is "0", and so on, up to item 59,999, whose correct answer is "8".

Next, let's check the contents of x_train (the handwritten images for training). Displaying every item would be overwhelming, so we look only at the first one, x_train[0].

python


(x_train, y_train), (x_test, y_test) = mnist.load_data()
print(x_train[0].shape)  # Execution result -> (28, 28)
print(x_train[0])        # Execution result -> see below

You can check the dimensions of a numpy.ndarray with .shape. The result (28, 28) means that x_train[0] is a **two-dimensional array with 28 rows and 28 columns**. The output of print(x_train[0]) is shown below.

If you squint, you can make out a slightly distorted handwritten "5". This matches the "5" stored in y_train[0].

[Figure: printed output of x_train[0]]

You can see that each pixel is a **value in the range 0 to 255**, where 0 is the background (white) and 255 is the darkest ink (black).
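
If the printed numbers are hard to read, one option is to turn the array into simple text art. The helper `to_ascii` and its `threshold` parameter below are hypothetical, and a tiny 5x5 array stands in for a real 28x28 image so that the sketch runs on its own.

```python
import numpy as np

def to_ascii(image, threshold=128):
    """Render a 2-D array of 0-255 values as text: '#' for ink, '.' for background."""
    return "\n".join(
        "".join("#" if value >= threshold else "." for value in row)
        for row in image
    )

# Tiny made-up 5x5 "image" of a vertical stroke (a stand-in for a 28x28 digit).
tiny = np.array([
    [0, 0, 255, 0, 0],
    [0, 0, 255, 0, 0],
    [0, 0, 200, 0, 0],
    [0, 0, 255, 0, 0],
    [0, 0, 130, 0, 0],
], dtype=np.uint8)

art = to_ascii(tiny)
print(art)
```

With the real data loaded, calling print(to_ascii(x_train[0])) should outline the handwritten "5" in the same way.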

Let's confirm this for all 60,000 items.

python


import numpy as np
print(x_train.min())  # Minimum value. Execution result -> 0
print(x_train.max())  # Maximum value. Execution result -> 255

This confirms that every value falls in the range 0 to 255.

By the way, how many of each digit from "0" to "9" are there in the 60,000 training items? I would expect the ten digits to appear roughly equally often, but let's check. We use pandas for the aggregation.

pandas version


import pandas as pd

tmp = pd.DataFrame({'label':y_train})
tmp = tmp.groupby(by='label').size()
display(tmp)
print(f'Total number={tmp.sum()}')

Execution result


label
0    5923
1    6742
2    5958
3    6131
4    5842
5    5421
6    5918
7    6265
8    5851
9    5949
dtype: int64
Total number=60000

There is some variation: "5" appears the least often and "1" the most often.

You can also get the counts without pandas, as follows.

numpy version


import numpy as np
# Count how many labels equal each digit 0 to 9
tmp = [np.count_nonzero(y_train == p) for p in range(10)]
print(tmp)                         # Execution result -> [5923, 6742, 5958, 6131, 5842, 5421, 5918, 6265, 5851, 5949]
print(f'Total number={sum(tmp)}')  # Execution result -> Total number=60000
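
As another possible variant, np.bincount can count all ten digits in one call. The sketch below uses a short made-up `labels` array as a stand-in for y_train so that it runs without the dataset.

```python
import numpy as np

# Made-up labels standing in for y_train (the real array has 60,000 entries).
labels = np.array([5, 0, 4, 1, 9, 2, 1, 3, 1, 4], dtype=np.uint8)

# np.bincount counts occurrences of each non-negative integer;
# minlength=10 keeps a slot for digits that never appear.
counts = np.bincount(labels, minlength=10)
print(counts.tolist())  # -> [1, 3, 1, 1, 2, 1, 0, 0, 0, 1]

# Share of each digit, which makes any imbalance easier to compare.
ratios = counts / counts.sum()
print(ratios.round(2).tolist())
```

Applied to the real y_train, np.bincount(y_train, minlength=10) should give the same ten counts as the versions above.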

Next time

- I wanted to go as far as displaying the input data graphically with matplotlib, but this article has become long, so I will leave that for next time.
