[PYTHON] Challenge image classification with TensorFlow2 + Keras 1 ~ Get it running for the time being ~

Introduction

This is the first study memo in a series on image classification with TensorFlow2 + Keras (in the Google Colaboratory environment). The subject is the standard one: classification of handwritten digit images (**MNIST**).

"Challenge image classification by TensorFlow2 + Keras" series:

--1. Move for the time being (this article)
--2. Take a closer look at the input data
--3. Visualize MNIST data
--4. Let's make a prediction with the trained model
--5. Observe images that fail to classify
--6. Try preprocessing and classifying images prepared by yourself
--7. Understanding layer types and activation functions
--8. Select optimization algorithm and loss function
--9. Try learning, saving and loading the model

Specifically, the task is to take **images** (28x28 pixels) of handwritten digits from "0" to "9", such as the following, and decide **which of "0" to "9" each image should be classified as**.

(Figure: MNIST-1.png, examples of handwritten digit images)

This article approaches that problem (a multi-class classification problem) with deep learning using TensorFlow2 + Keras.

For the development and execution environment, I use Google Colab, which is easy, convenient, and free. For an introduction to Google Colab, please refer to here.

In this article, I copy the sample code from the official TensorFlow website, paste it into a code cell in Google Colab, and confirm that it runs without problems.

On top of that, I loosely explain **what each part of the code is doing** and **what the text displayed at runtime conveys**.

What is TensorFlow?

--Pronounced "tensor flow".
--A machine learning library developed by Google that lets you build and train neural networks (NN). Of course, you can also make predictions using the trained NN model.
--1.0 was released in February 2017, and 2.0 was released in October 2019.
--In TF 2.0, Keras (described later) was integrated and the affinity with Python was increased, reportedly making it easier to use and more refined. GPU support is also said to have been strengthened.
--Development continues so as not to fall behind later-arriving machine learning libraries such as PyTorch.

Keras

--Pronounced "Kerasu".
--A high-level API (wrapper) that supports TensorFlow as well as Theano.
--Written in Python.
--By using TF via Keras, machine learning can be implemented with simple, short code.

Try the sample code

"Introduction to TensorFlow 2.0 for Beginners" on the official TensorFlow website contains sample code (only a dozen or so lines) that classifies the handwritten digit image dataset (MNIST) into the categories "0" to "9". Paste this into Google Colab and run it.

Switch TF version from 1.x to 2.x

To use TensorFlow2, execute the following **magic command** in a code cell (paste it into the code cell and run it with [Ctrl] + [Enter]). This is necessary because, as of December 27, 2019, Google Colab uses TensorFlow **1.x** by default, and the command switches it to **2.x**.

Preparation on Google Colab


%tensorflow_version 2.x

If there is no problem, "TensorFlow 2.x selected." is displayed.

If you run TF (TensorFlow) 1.x, the message "The default version of TensorFlow in Colab will soon switch to TensorFlow 2.x." is displayed, so I expect this procedure will soon be unnecessary (TF 2.x will become the default).

Sample code and execution

I have added a few comments to the sample code on the official website.

import tensorflow as tf

# (1)Download the handwritten digit image dataset (MNIST) and store it in a variable
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# (2)Data normalization (preprocessing for input data)
x_train, x_test = x_train / 255.0, x_test / 255.0

# (3)Building an NN model
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation='softmax')
])

# (4)Compiling the model (including settings related to learning)
model.compile(optimizer='adam',loss='sparse_categorical_crossentropy',metrics=['accuracy'])

# (5)Model training (learning / training)
model.fit(x_train, y_train, epochs=5)

# (6)Model evaluation
model.evaluate(x_test,  y_test, verbose=2)

In the above short program, we do the following:

--Download the handwritten digit image dataset and store it in variables (data preparation)
  --*_train: training data
  --*_test: test (evaluation) data
  --For details of these data, see Part 2 "~ Take a closer look at the input data ~".
--Data normalization (preprocessing of the input data)
  --Convert integer values in the range 0-255 to real numbers in the range 0.0-1.0
--Construction of the neural network model for machine learning
  --Details will be explained in Part 7 "~ Understanding layer types and activation functions ~".
--Compiling the model (including settings related to training)
  --Details will be explained in Part 8 "~ Selecting an optimization algorithm and loss function ~".
--Model training using the training data (*_train) (→ the **trained model** is complete)
--Model evaluation using the test data (*_test) (running image classification with the trained model and scoring the answers)
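As a small illustration of the normalization step (2), here is a sketch that applies the same division by 255.0 to a made-up 2x2 patch of pixel values held in plain Python lists. The sample code itself operates on the NumPy arrays x_train and x_test; the names `patch` and `normalize` are hypothetical and exist only for this example.

```python
# Hypothetical 2x2 patch of 8-bit grayscale pixel values (integers in 0-255).
patch = [[0, 51],
         [204, 255]]

def normalize(image):
    """Scale integer pixel values in 0-255 to floats in 0.0-1.0."""
    return [[pixel / 255.0 for pixel in row] for row in image]

normalized = normalize(patch)
print(normalized)  # [[0.0, 0.2], [0.8, 1.0]]
```

The smallest value (0) maps to 0.0 and the largest (255) maps to 1.0, which is the range described above.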

The execution result of the program is as follows.

Execution result


Train on 60000 samples
Epoch 1/5
60000/60000 [==============================] - 5s 82us/sample - loss: 0.2992 - accuracy: 0.9134
Epoch 2/5
60000/60000 [==============================] - 5s 78us/sample - loss: 0.1457 - accuracy: 0.9561
Epoch 3/5
60000/60000 [==============================] - 5s 78us/sample - loss: 0.1096 - accuracy: 0.9659
Epoch 4/5
60000/60000 [==============================] - 5s 78us/sample - loss: 0.0876 - accuracy: 0.9730
Epoch 5/5
60000/60000 [==============================] - 5s 80us/sample - loss: 0.0757 - accuracy: 0.9765
10000/10000 - 0s - loss: 0.0766 - accuracy: 0.9762
[0.07658648554566316, 0.9762]

What this output means:

--`Train on 60000 samples`: Training uses 60,000 handwritten digit images.
--`Epoch x/5`: This is the x-th training pass out of 5 in total.
--`5s 82us/sample - loss: 0.2992 - accuracy: 0.9134`: Training took 82 $\mu$s per image, or about 5 seconds for the whole set (60,000 images). The performance of the model trained this way (evaluated on the training data) was a loss function value (loss) of 0.2992 and a correct answer rate (accuracy) of 0.9134.
  --A correct answer rate of 0.9134 is interpreted as follows: $60{,}000 \times 0.9134 = 54{,}804$ images were correctly classified into 0 to 9, and the remaining $60{,}000 - 54{,}804 = 5{,}196$ images were misclassified.
--`10000/10000 - 0s - loss: 0.0766 - accuracy: 0.9762`: Classification was tested on 10,000 test images (separate from the ones used for training). The test took less than one second (shown as 0s), with a loss function value (loss) of 0.0766 and a correct answer rate (accuracy) of 0.9762.
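The arithmetic behind this reading of the log can be checked with a few lines of Python. This is only a sketch: the variable names are made up here, the numbers are copied from the first epoch of the execution result above, and the image counts should be read as approximate because the logged accuracy is rounded.

```python
# Values read off the first epoch of the execution result.
n_train = 60_000            # number of training images
time_per_sample_us = 82     # microseconds per image
train_accuracy = 0.9134     # accuracy reported for epoch 1

# Total time for one epoch: 82 microseconds x 60,000 images, in seconds.
epoch_seconds = n_train * time_per_sample_us / 1_000_000

# Approximate number of correctly and incorrectly classified training images.
n_correct = round(n_train * train_accuracy)
n_wrong = n_train - n_correct

print(epoch_seconds)       # 4.92
print(n_correct, n_wrong)  # 54804 5196
```

The 4.92 seconds agrees with the "5s" shown in the log.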

What is the accuracy rate?

Also called "accuracy" or "correct answer rate". It represents the proportion of images that were classified correctly. For example, if 98 out of 100 images are classified correctly, the correct answer rate is $98/100 = 0.98$ (= 98%).

The correct answer rate ranges from 0.0 to 1.0, and **the larger the value (the closer to 1.0), the better the model** (when evaluated on data not used for training).
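As a minimal sketch of this definition, the correct answer rate can be computed by counting how many predicted labels match the correct labels. The function name `accuracy` and the ten labels below are made up for illustration and are not taken from MNIST.

```python
def accuracy(predicted, correct):
    """Fraction of predictions that match the correct labels."""
    matches = sum(p == c for p, c in zip(predicted, correct))
    return matches / len(correct)

# Hypothetical labels for 10 images; exactly one prediction (the 8) is wrong.
correct_labels = [7, 2, 1, 0, 4, 1, 4, 9, 5, 9]
predicted_labels = [7, 2, 1, 0, 4, 1, 4, 9, 8, 9]

print(accuracy(predicted_labels, correct_labels))  # 0.9
```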

What is a loss function value (loss)?

The quality of a model (classifier) cannot always be measured by the correct answer rate alone. For example, suppose you classify (predict) a single image whose correct answer is "3" using two different models.

(Figure: the handwritten digit image to be classified; the correct answer is "3")

For this image, Model A predicts "3" and Model B also predicts "3". Since the correct answer is "3", **the correct answer rate is 1.0 for both models**. Looking only at the **correct answer rate**, the two models are equally good.

However, what if Model A's prediction is "**8 with 10% confidence, 3 with 90% confidence, so 3 is selected**", while Model B's prediction is "**8 with 45% confidence, 3 with 55% confidence, so 3 is output**"?

**Even though both have the same correct answer rate of 1.0**, Model A can be said to be superior.

However, the correct answer rate cannot take this difference into account. What evaluates it is the **loss function**, and the value computed by the loss function is the **loss**.

The handwritten digit classification dealt with here is a "**multi-class classification problem**", and the loss function for this type of problem is usually the **cross entropy error** (cross entropy). The cross entropy is calculated from the values in the output layer of the neural network and the correct answer data. Details are explained in Part 8 "~ Selecting an optimization algorithm and loss function ~".

Basically, the loss function value is 0.0 or more, and **the smaller the loss function value (the closer to 0.0), the better the model**. The loss function value can exceed 1.0.
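To make the Model A / Model B comparison concrete, here is a rough sketch of the cross entropy for a single image. It assumes the natural logarithm and a single correct label, in which case the cross entropy reduces to the negative log of the probability assigned to the correct class. The function `cross_entropy` and the dictionary representation are hypothetical simplifications for this example, not the Keras implementation.

```python
import math

def cross_entropy(probabilities, correct_label):
    """Cross entropy for one sample: -log of the probability of the correct class."""
    return -math.log(probabilities[correct_label])

# Predicted probabilities per digit, taken from the Model A / Model B example.
model_a = {3: 0.90, 8: 0.10}
model_b = {3: 0.55, 8: 0.45}

loss_a = cross_entropy(model_a, correct_label=3)
loss_b = cross_entropy(model_b, correct_label=3)
print(round(loss_a, 3), round(loss_b, 3))  # 0.105 0.598

# A confident wrong answer: only 10% is assigned to the correct digit "3".
loss_wrong = cross_entropy({3: 0.10, 8: 0.90}, correct_label=3)
print(round(loss_wrong, 3))  # 2.303
```

Both models answer correctly, but Model A has the smaller loss, which matches the intuition that it is the better model. The last case shows how the loss can exceed 1.0 when the model puts little probability on the correct answer.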

Next time

--Next time, I would like to explain the training data (x_train, y_train) and test data (x_test, y_test), and visualize them using matplotlib.
