[PYTHON] Creating a learning model using MNIST

What is MNIST

MNIST is a dataset of handwritten digit images. In this article, we will use it to build a model that learns to identify handwritten digits.

MNIST data

MNIST consists of image data (images of handwritten digits) and label data (the digit each image shows).

60,000 of these image/label pairs are provided for training and 10,000 for testing.

Contents of actual image data

In order to understand the learning model more deeply, let's check what kind of data is actually contained in MNIST.


import sys
from keras.datasets import mnist
from PIL import Image

# Load MNIST: training images/labels and test images/labels
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Print the pixel values of the first training image,
# save it as a PNG, and print its label
train_no = 0

print('Training image')
for xs in X_train[train_no]:
    for x in xs:
        sys.stdout.write('%03d ' % x)
    sys.stdout.write('\n')

outImg = Image.fromarray(X_train[train_no].reshape((28,28))).convert("RGB")
outImg.save("train.png")

print('Training label(y_train) = %d' % y_train[train_no])

# Do the same for the first test image and its label
test_no = 0

print('Test image')
for xs in X_test[test_no]:
    for x in xs:
        sys.stdout.write('%03d ' % x)
    sys.stdout.write('\n')

outImg = Image.fromarray(X_test[test_no].reshape((28,28))).convert("RGB")
outImg.save("test.png")

print('Test label(y_test) = %d' % y_test[test_no])

This program prints the first training sample and the first test sample.

X_train holds the training images, y_train the training labels, X_test the test images, and y_test the test labels. Printing the first entry of each gives the following output.

Training image
000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 000 003 018 018 018 126 136 175 026 166 255 247 127 000 000 000 000 
000 000 000 000 000 000 000 000 030 036 094 154 170 253 253 253 253 253 225 172 253 242 195 064 000 000 000 000 
000 000 000 000 000 000 000 049 238 253 253 253 253 253 253 253 253 251 093 082 082 056 039 000 000 000 000 000 
000 000 000 000 000 000 000 018 219 253 253 253 253 253 198 182 247 241 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 080 156 107 253 253 205 011 000 043 154 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 014 001 154 253 090 000 000 000 000 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 139 253 190 002 000 000 000 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 011 190 253 070 000 000 000 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 000 035 241 225 160 108 001 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 000 000 081 240 253 253 119 025 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 000 000 000 045 186 253 253 150 027 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 016 093 252 253 187 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 249 253 249 064 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 000 000 000 046 130 183 253 253 207 002 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 000 039 148 229 253 253 253 250 182 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 024 114 221 253 253 253 253 201 078 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 023 066 213 253 253 253 253 198 081 002 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 018 171 219 253 253 253 253 195 080 009 000 000 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 055 172 226 253 253 253 253 244 133 011 000 000 000 000 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 136 253 253 253 212 135 132 016 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 
Training label(y_train) = 5
Test image
000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 084 185 159 151 060 036 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 222 254 254 254 254 241 198 198 198 198 198 198 198 198 170 052 000 000 000 000 000 000 
000 000 000 000 000 000 067 114 072 114 163 227 254 225 254 254 254 250 229 254 254 140 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 017 066 014 067 067 067 059 021 236 254 106 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 083 253 209 018 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 022 233 255 083 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 129 254 238 044 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 059 249 254 062 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 133 254 187 005 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 009 205 248 058 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 126 254 182 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 000 000 000 075 251 240 057 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 000 000 019 221 254 166 000 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 000 003 203 254 219 035 000 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 000 038 254 254 077 000 000 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 031 224 254 115 001 000 000 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 133 254 254 052 000 000 000 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 061 242 254 254 052 000 000 000 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 121 254 254 219 040 000 000 000 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 121 254 207 018 000 000 000 000 000 000 000 000 000 000 000 000 000 000 
000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 
Test label(y_test) = 7

You can faintly make out the digits 5 and 7 in the pixel values, and the labels printed after each image match them.
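
The digits are easier to see if each pixel value is mapped to a single character instead of a three-digit number. The following is a minimal sketch of that idea. The helper name render_ascii, the threshold of 128, and the tiny 5x5 sample grid are assumptions made for illustration and are not part of the original program.

```python
def render_ascii(rows, threshold=128):
    """Return one string per pixel row: '#' for bright pixels, '.' for dark ones."""
    return [''.join('#' if value >= threshold else '.' for value in row)
            for row in rows]

# Hypothetical 5x5 grayscale patch (0 = black, 255 = white), roughly shaped like a "7"
sample = [
    [0, 254, 254, 254, 0],
    [0, 0, 0, 254, 0],
    [0, 0, 253, 0, 0],
    [0, 205, 0, 0, 0],
    [0, 242, 0, 0, 0],
]

for line in render_ascii(sample):
    print(line)
```

With the real data, passing X_train[0] instead of the sample grid should produce a 28-line picture of the first training digit.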

Creating a learning model using MNIST

A machine learning model takes an input in a form a computer can process, makes some evaluation or judgment, and produces an output. Building a model requires a large amount of training data, which MNIST provides here. After the model has learned from the training data repeatedly, we test how well it has learned. This is called evaluating the accuracy of the model, and the test data is used for it.
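
To make "evaluating the accuracy" concrete: accuracy is the fraction of test samples whose predicted label equals the true label. The following is a minimal sketch of that calculation. The helper name accuracy and the labels are made up for illustration.

```python
def accuracy(predicted_labels, true_labels):
    """Fraction of positions where the predicted label matches the true label."""
    if len(predicted_labels) != len(true_labels):
        raise ValueError('both sequences must have the same length')
    if not true_labels:
        raise ValueError('at least one sample is required')
    correct = sum(p == t for p, t in zip(predicted_labels, true_labels))
    return correct / len(true_labels)

# Made-up example: 4 of the 5 predictions are right
true_labels = [7, 2, 1, 0, 4]
predicted_labels = [7, 2, 1, 0, 9]
print(accuracy(predicted_labels, true_labels))  # 0.8
```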

Let's actually create a learning model.

from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.optimizers import Adam
from keras.utils import to_categorical

def build_model():
    # Build a fully connected network: 784 inputs -> 512 -> 512 -> 10 outputs
    model = Sequential()
    model.add(Dense(512, input_shape=(784,)))
    model.add(Activation('relu'))
    model.add(Dropout(0.2))

    model.add(Dense(512))
    model.add(Activation('relu'))
    model.add(Dropout(0.2))

    model.add(Dense(10))
    model.add(Activation('softmax'))

    # Set the loss function, optimizer, and metric
    model.compile(
        loss='categorical_crossentropy',
        optimizer=Adam(),
        metrics=['accuracy'])

    return model

if __name__ == "__main__":
    # Load the MNIST data
    # 60,000 training samples and 10,000 test samples
    # Each image is 28 x 28 pixels = 784 values
    # Each pixel is a grayscale value from 0 to 255
    (X_train, y_train), (X_test, y_test) = mnist.load_data()
    X_train = X_train.reshape(60000, 784).astype('float32')
    X_test = X_test.reshape(10000, 784).astype('float32')
    # Scale the pixel values to the range 0-1
    X_train /= 255
    X_test /= 255

    # One-hot encode the labels as 10-element vectors,
    # e.g. the digit 5 becomes [0,0,0,0,0,1,0,0,0,0]
    y_train = to_categorical(y_train, 10)
    y_test = to_categorical(y_test, 10)

    # Train the model; only 2 epochs this time to save time
    model = build_model()
    model.fit(X_train, y_train,
        epochs=2,  # number of passes over the training data (older Keras versions called this nb_epoch)
        batch_size=128,  # number of images per weight update; other values also work
        validation_data=(X_test, y_test)
    )

    # Save the trained model
    # The architecture goes into a JSON file
    json_string = model.to_json()
    with open('mnist.json', 'w') as f:
        f.write(json_string)
    # The weights go into an HDF5 file
    # (newer Keras versions require a file name ending in .weights.h5)
    model.save_weights('mnist.hdf5')

    # Evaluate the model on the test data
    score = model.evaluate(X_test, y_test, verbose=1)

    print('loss=', score[0])
    print('accuracy=', score[1])

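The trained model is saved as two files: a JSON file (mnist.json) that holds the network architecture and an HDF5 file (mnist.hdf5) that holds the learned weights.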
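
Two preprocessing steps in the program above are worth spelling out: dividing the pixels by 255, and converting each label to a 10-element vector. The following plain-Python sketch shows roughly what those steps do for a single image and a single label. The helper names scale_pixels and one_hot are hypothetical. The real program performs the same operations on whole NumPy arrays, using to_categorical for the labels.

```python
def scale_pixels(pixels):
    """Map 8-bit pixel values (0-255) to floats in the range 0.0-1.0."""
    return [value / 255 for value in pixels]

def one_hot(label, num_classes=10):
    """Return a list with 1.0 at index `label` and 0.0 everywhere else."""
    if not 0 <= label < num_classes:
        raise ValueError('label must be between 0 and num_classes - 1')
    vector = [0.0] * num_classes
    vector[label] = 1.0
    return vector

print(scale_pixels([0, 51, 255]))  # [0.0, 0.2, 1.0]
print(one_hot(5))  # 1.0 at index 5, i.e. the sixth position
```

Note that the digit 5 sets index 5, which is the sixth element because counting starts at 0.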

Running the program gives the following output:

Train on 60000 samples, validate on 10000 samples
Epoch 1/2
60000/60000 [==============================] - 9s 152us/step - loss: 0.2453 - accuracy: 0.9264 - val_loss: 0.0981 - val_accuracy: 0.9683
Epoch 2/2
60000/60000 [==============================] - 9s 150us/step - loss: 0.1009 - accuracy: 0.9693 - val_loss: 0.0752 - val_accuracy: 0.9760
10000/10000 [==============================] - 1s 65us/step
loss= 0.07517002068450675
accuracy= 0.9760000109672546

We were able to create a model with an accuracy of about 97.6% on the test data.
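
A prediction counts as correct when the digit with the highest output probability matches the label. The final softmax layer turns the 10 raw scores into probabilities, and the predicted digit is the index of the largest one. The following is a minimal sketch of that step. The helper names softmax and predict_digit and the scores are made up for illustration.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(scores)
    # Subtracting the largest score avoids overflow without changing the result
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_digit(scores):
    """Return the index (0-9) with the highest softmax probability."""
    probabilities = softmax(scores)
    return max(range(len(probabilities)), key=probabilities.__getitem__)

# Made-up raw scores for one image; index 7 has by far the largest score
scores = [0.1, 0.2, 0.0, 0.3, 0.1, 0.2, 0.1, 4.0, 0.2, 0.5]
print(predict_digit(scores))  # 7
```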

Because MNIST is a relatively easy dataset, training for only 2 epochs to save time still gives reasonable accuracy.
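
To get a feel for how much training 2 epochs represents, we can count the weight updates. The following is a minimal sketch, assuming the usual mini-batch behaviour in which every sample is seen once per epoch and the last batch may be smaller than the rest. The helper name updates_per_epoch is hypothetical.

```python
import math

def updates_per_epoch(num_samples, batch_size):
    """Number of mini-batches needed to see every sample once."""
    return math.ceil(num_samples / batch_size)

steps = updates_per_epoch(60000, 128)
print(steps)      # 469 batches per epoch (468 full batches plus one of 96 images)
print(steps * 2)  # 938 weight updates over 2 epochs
```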

In practice, models are often trained for many more epochs, with numbers such as 500 or 1000.

Summary

Working through this example deepened my understanding of machine learning and MNIST. Next, I would like to refine the code I wrote this time to improve the accuracy.
