Introduction

Last time, we set up the deep learning library Keras on Windows and performed binary classification of whether a rectangle is portrait or landscape. If you haven't read that article yet, this one may be easier to follow after reading it. Binary classification alone hardly shows what deep learning can do, so as the next step I will perform multi-class classification of digits using MNIST, a dataset of handwritten digits from 0 to 9. In this article the network is still a multilayer perceptron (MLP). I haven't explained multilayer perceptrons yet, so let's go over them briefly first. Weight initialization and optimizer parameters will be covered separately, around Part 5.

Execution environment

Windows 10 (64-bit), CPU only
Python 3.5.2
Anaconda 4.2.0 (64-bit)
TensorFlow 0.12.1
Keras 1.2.1

What is a Multilayer Perceptron?

Perceptron

What exactly is a perceptron? According to the first book in the references:

The perceptron is an algorithm devised in 1957 by an American researcher named Rosenblatt, and it is the origin of neural networks (deep learning).

(The wording has been changed slightly.) That alone does not really convey what a perceptron is, so let me explain the perceptron itself. In words, it is "a device that receives multiple signals as inputs and outputs one signal", and the output takes one of two values, 0 or 1. The neuron computes the weighted sum of the incoming signals and outputs 1 only when that sum exceeds a certain limit. This is sometimes expressed as "**the neuron fires**". The limit is called the threshold. Writing it as $\theta$, the output $y$ for two inputs $x_1, x_2$ with weights $\omega_1, \omega_2$ is given by the following formula.

\begin{eqnarray}
y=\left\{ \begin{array}{ll}
0 & (\omega_1 x_1 + \omega_2 x_2 \leq \theta) \\
1 & (\omega_1 x_1 + \omega_2 x_2 > \theta) \\
\end{array} \right.
\end{eqnarray}

In other words, each **weight $\omega$ represents the importance of its input signal**. Moving the threshold to the left-hand side and defining the bias as $b = -\theta$, the firing condition becomes $\omega_1 x_1 + \omega_2 x_2 + b > 0$, so the **bias $b$ is a parameter that adjusts how easily the neuron fires**. In some contexts the bias is also counted among the weights. As the formula shows, however, a single perceptron can only separate its inputs with a straight line, so it cannot handle problems that are not linearly separable.
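As a minimal sketch of the threshold formula above, a two-input perceptron can be written in a few lines of plain Python. The function names and the concrete values (weights 0.5 and 0.5, threshold 0.7) are illustrative assumptions, one of many choices that make the unit behave like an AND gate.

```python
def perceptron(x1, x2, w1, w2, theta):
    """Return 1 if the weighted sum of the inputs exceeds theta, else 0."""
    weighted_sum = w1 * x1 + w2 * x2
    return 1 if weighted_sum > theta else 0


def and_gate(x1, x2):
    # Illustrative parameters (assumed, not from the article):
    # only the input (1, 1) pushes the sum 1.0 above the threshold 0.7.
    return perceptron(x1, x2, w1=0.5, w2=0.5, theta=0.7)


inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_outputs = [and_gate(x1, x2) for x1, x2 in inputs]
print(and_outputs)  # [0, 0, 0, 1]
```

No choice of `w1`, `w2` and `theta` makes this single unit reproduce XOR, because XOR cannot be separated by one straight line.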

Multilayer perceptron

A simple perceptron can only solve linear problems. However, perceptrons can be stacked in layers, which lets them handle non-linear problems as well. This is called a multilayer perceptron. There is one thing to keep in mind: what is commonly called a multilayer perceptron differs from the simple perceptron described above. A simple perceptron outputs either 0 or 1, whereas a unit in a multilayer perceptron outputs a real number, for example a value between 0 and 1 when the activation function is the sigmoid. The term "activation function" has just appeared without explanation, so here it is. Where the simple perceptron decides between 0 and 1 by comparing against a threshold, the multilayer perceptron passes the weighted sum of the inputs plus the bias through a non-linear activation function $h$, and the result is the real-valued output.

\begin{eqnarray}
  a &=& \omega_1 x_1 + \omega_2 x_2 + b\\
  y &=& h(a)
\end{eqnarray}

For example, the sigmoid function is defined as follows and outputs a real value between 0 and 1 depending on its input.

\begin{equation}
h(x) = \frac{1}{1+\exp(-x)}
\end{equation}
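The two formulas, $a = \omega_1 x_1 + \omega_2 x_2 + b$ and $y = h(a)$ with the sigmoid as $h$, can be sketched in plain Python as follows. The function names and the sample weights are assumptions chosen only for illustration.

```python
import math


def sigmoid(x):
    """h(x) = 1 / (1 + exp(-x)); fine for moderate x, may overflow for very negative x."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron(x1, x2, w1, w2, b, h=sigmoid):
    """Weighted sum plus bias, passed through the activation function h."""
    a = w1 * x1 + w2 * x2 + b
    return h(a)


# Illustrative values (assumed): the weighted sum is exactly 0, so the output is 0.5.
y = neuron(1.0, 1.0, w1=0.5, w2=0.5, b=-1.0)
print(y)  # 0.5
```

Unlike the step function, the output changes smoothly with the input, which is what makes gradient-based training possible.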

To summarize:

- A **simple perceptron** is a **single-layer** network whose activation function is the **step function**.
- A **multilayer perceptron** is a **multi-layer** neural network whose activation function is a **non-linear function such as the sigmoid**.
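To illustrate why adding layers helps, here is a small sketch that builds XOR, which no single perceptron can represent, out of three step-function units arranged in two layers. The weights and biases are hand-picked illustrative assumptions rather than learned values, and the units use the step function to keep the example short.

```python
def step_unit(x1, x2, w1, w2, b):
    """A perceptron written with a bias: fires when w1*x1 + w2*x2 + b > 0."""
    return 1 if w1 * x1 + w2 * x2 + b > 0 else 0


def xor(x1, x2):
    # First layer: two units looking at the raw inputs (hand-picked parameters).
    s1 = step_unit(x1, x2, -0.5, -0.5, 0.7)   # behaves like NAND
    s2 = step_unit(x1, x2, 0.5, 0.5, -0.2)    # behaves like OR
    # Second layer: one unit combining the first-layer outputs.
    return step_unit(s1, s2, 0.5, 0.5, -0.7)  # behaves like AND


inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor_outputs = [xor(x1, x2) for x1, x2 in inputs]
print(xor_outputs)  # [0, 1, 1, 0]
```

In a real multilayer perceptron the weights are learned from data, and the step function is replaced by a differentiable activation such as the sigmoid.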

Identification of handwritten characters

Hopefully the multilayer perceptron is now reasonably clear. With that in place, we can finally classify handwritten digits using a multilayer perceptron.

Learning

Keras can load the MNIST training and test data as a built-in dataset, so unlike last time you don't need to download the dataset yourself. The script `mnist_detect_class10.py` runs training and displays the results. Last time I used the sigmoid function as the activation function of the multilayer perceptron, but this time I use ReLU. I chose ReLU because it gives better accuracy. The reason why will be explained in another article. Also, for multi-class classification, set the activation function of the final layer to softmax so that the outputs can be read as class probabilities that sum to 1. The Jupyter Notebook has been uploaded to Gist.
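For reference, the two activation functions mentioned here can be sketched in plain Python. This is a conceptual illustration with assumed function names and made-up input scores, not the code Keras runs internally.

```python
import math


def relu(x):
    """ReLU passes positive values through and clips negative values to 0."""
    return max(0.0, x)


def softmax(scores):
    """Turn raw scores into probabilities that are positive and sum to 1."""
    m = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three classes (assumed values for illustration).
probs = softmax([2.0, 1.0, 0.1])
predicted = probs.index(max(probs))
print(probs, predicted)
```

The predicted class is simply the index with the largest probability, which is why the final layer has one output per class.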

`mnist_detect_class10.py`


import numpy as np
import matplotlib.pyplot as plt

from keras.utils import np_utils
from scipy.misc import toimage

#Required to read the MNIST dataset
from keras.datasets import mnist

from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation

#Setting the number of classes to classify
class_num = 10

#Load mnist dataset
(X_train, y_train),(X_test, y_test) = mnist.load_data()

#Normalize training data
X_train = X_train/255
X_test = X_test/255

#Format data for input
#Convert a 2D array of 28 pixels x 28 pixels per image into a 784-dimensional vector
X_train = X_train.reshape(-1,784)
X_test = X_test.reshape(-1,784)

#Change the label to an array corresponding to the number of classes
#Example: y_train:[0 1 7 5] -> Y_train:[[1 0 0 0 0 0 0 0 0 0],[0 1 0 0 0 0 0 0 0 0],[0 0 0 0 0 0 0 1 0 0],[0 0 0 0 0 1 0 0 0 0]]
Y_train = np_utils.to_categorical(y_train,class_num)
Y_test = np_utils.to_categorical(y_test,class_num)

#Build the multilayer perceptron
#The input is 784-dimensional (28x28) and the final output size equals the number of classes
model = Sequential()
model.add(Dense(512, input_dim=784, init='uniform'))
#Use ReLU function for activation function
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(512, init='uniform'))
#Use ReLU function for activation function
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(class_num, input_dim=512, init='uniform'))
#For multi-class classification, use softmax in the final layer
model.add(Activation('softmax'))

#This is multi-class classification, so use categorical cross-entropy as the loss and RMSprop as the optimizer
model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

#Train the model for a fixed number of epochs (passes over the training data)
model.fit(X_train, Y_train,
          nb_epoch=5,
          batch_size=100)
#Evaluate the trained model on the test data; score is [loss, accuracy]
score = model.evaluate(X_test, Y_test, batch_size=100)
print(score)
#Predict the class labels of the first 10 test images
classified_label = model.predict_classes(X_test[0:10,:])

#Display of estimated label and image
for i in range(10):
    #Convert vector data to array and convert it to image data
    img = toimage(X_test[i,:].reshape(28,28))
    plt.subplot(2,5,i+1)
    plt.imshow(img, cmap='gray')
    plt.title("Class {0}".format(classified_label[i]))
    plt.axis('off')
plt.show()

#Save the model architecture and the weights
model_json = model.to_json()
with open('mnist_architecture.json', 'w') as f:
    f.write(model_json)
model.save_weights('mnist_weights.h5', overwrite=True)
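The data preparation and the loss used in the script above can be illustrated without Keras. The following is a conceptual sketch in plain Python. The helper names are assumptions, and the cross-entropy here is the textbook formula rather than Keras's exact implementation.

```python
import math


def flatten(image_rows):
    """Turn a 2D image (list of rows) into one flat vector, like reshape(-1, 784)."""
    return [pixel for row in image_rows for pixel in row]


def to_one_hot(label, num_classes):
    """Label 7 with 10 classes becomes [0, 0, 0, 0, 0, 0, 0, 1, 0, 0]."""
    vec = [0] * num_classes
    vec[label] = 1
    return vec


def categorical_crossentropy(one_hot, probs, eps=1e-12):
    """Textbook cross-entropy: minus the log of the probability given to the true class."""
    return -sum(t * math.log(p + eps) for t, p in zip(one_hot, probs))


blank_image = [[0] * 28 for _ in range(28)]  # dummy 28x28 image
vector = flatten(blank_image)
target = to_one_hot(7, 10)
# Hypothetical prediction that gives the true class (7) a probability of 0.5.
prediction = [0.5 / 9] * 7 + [0.5] + [0.5 / 9] * 2
loss = categorical_crossentropy(target, prediction)
print(len(vector), target, loss)
```

The loss shrinks toward 0 as the probability assigned to the correct class approaches 1, which is what training tries to achieve.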

Result

In the output, the predicted labels for the first 10 test images match the digits in the images. The accuracy on the test data was 97.7%, which is good enough for practical use. One of the test images is a 5 that looks as if it could be mistaken for an 8, yet it is classified correctly. Incidentally, with sigmoid the accuracy was only about 80%, so choosing appropriate settings matters.
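The accuracy figure is the fraction of test images whose most probable class equals the true label. A minimal sketch of that calculation follows. The helper names and the tiny three-class example are assumptions for illustration, not MNIST data.

```python
def argmax(values):
    """Index of the largest value, i.e. the predicted class."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(prob_rows, true_labels):
    """Fraction of samples whose predicted class equals the true label."""
    correct = sum(1 for probs, label in zip(prob_rows, true_labels)
                  if argmax(probs) == label)
    return correct / len(true_labels)


# Made-up softmax outputs for four samples and three classes.
prob_rows = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
]
true_labels = [1, 0, 2, 2]
acc = accuracy(prob_rows, true_labels)
print(acc)  # 0.75
```

In the script, `model.evaluate` returns this value as the second element of `score` because `metrics=['accuracy']` was passed to `compile`.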

Experiment with digits I wrote myself

At this point, training appears to have produced a fairly accurate model. Still, I wanted to test it on digits I wrote myself. So I used Paint to draw the digits 0 to 9, each on a 56 x 56 pixel canvas, and saved them as images. The following code loads the trained weights and predicts the class of each image. The Jupyter Notebook has been uploaded to Gist.

`my_mnist.py`


import numpy as np
import matplotlib.pyplot as plt

from keras.utils import np_utils

from keras.models import Sequential,model_from_json
from keras.preprocessing import image
from keras.applications.imagenet_utils import preprocess_input, decode_predictions

#Number of images prepared for the test
test_num = 10

#Model loading
model = model_from_json(open('mnist_architecture.json').read())
#Load model weights
model.load_weights('mnist_weights.h5')

for i in range(test_num):
    #Specifying the file name
    img_path = str(i) + ".jpg"
    #28 pixels x 28 pixels, import images in grayscale
    img = image.load_img(img_path, grayscale=True, target_size=(28, 28))
    #Convert the imported image to an array
    x = image.img_to_array(img)
    #Checking the array size
    print(x.shape)
    
    #Shape a 28x28 2D array into a vector of size 784 for input to the model
    x = x[:,:,0].reshape(-1,784)

    #Class prediction
    classified_label = model.predict_classes(x)
    
    #Prediction result plot
    plt.subplot(2,5,i+1)
    plt.imshow(img, cmap='gray')
    plt.title("Class {0}".format(classified_label[0]))
    plt.axis('off')
plt.show()

Prediction results for my own handwritten digits

I had hoped that every prediction would be correct, but the accuracy on the digits I wrote was only 70%. This is probably because my handwriting differs from the handwriting in the dataset. That is not very satisfying, so as a postscript I am adding the results of training a CNN.
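Another factor that may be worth checking, stated here as an unverified assumption, is input preprocessing. The training script divides pixel values by 255, while `my_mnist.py` feeds the loaded array to the model as is, and MNIST digits are light strokes on a dark background, which may be the opposite of an image drawn in a paint program. The sketch below shows one way such preprocessing could look in plain Python. The function name and the `invert` flag are hypothetical and not part of the original scripts.

```python
def preprocess_pixels(pixels, invert=True):
    """Scale 0-255 grayscale values to 0-1 and optionally invert them.

    invert=True assumes dark strokes on a light background, so that the
    result has light strokes on a dark background like MNIST.
    """
    result = []
    for p in pixels:
        value = p / 255.0
        if invert:
            value = 1.0 - value
        result.append(value)
    return result


# White background (255), black stroke (0), dark gray (51).
sample = preprocess_pixels([255, 0, 51])
print(sample)
```

Whether this improves the accuracy reported above has not been tested here.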

Execution environment and source code during CNN learning

Execution environment

Training a CNN on an ordinary CPU-only PC takes too long, so I trained it in the cloud on a machine with a GPU. The execution environment was as follows. Training on MNIST in this environment took less than a minute.

Ubuntu 16.04.1 LTS
Geforce GTX TITAN X
Python 2.7.12
Tensorflow-gpu 0.12.1
Keras 1.2.0

Source code at the time of learning

`my_mnist_cnn_learning.py`


import numpy as np

from keras.utils import np_utils

from keras.datasets import mnist

from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras import backend as K

#Batch size (number of samples per gradient update)
batch_size = 128
#Number of classes to classify
nb_classes = 10
#Number of epochs
nb_epoch = 12

#Number of dimensions of the image to be input
img_rows, img_cols = 28, 28
#Number of filters used for convolution
nb_filters = 32
#Specify the size of the pooling layer
pool_size = (2, 2)
#Specifies the size of the convolution kernel
kernel_size = (3, 3)

#Read mnist data
(X_train, y_train), (X_test, y_test) = mnist.load_data()

if K.image_dim_ordering() == 'th':
    X_train = X_train.reshape(X_train.shape[0], 1, img_rows, img_cols)
    X_test = X_test.reshape(X_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    X_train = X_train.reshape(X_train.shape[0], img_rows, img_cols, 1)
    X_test = X_test.reshape(X_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)

X_train = X_train.astype('float32')
X_test = X_test.astype('float32')

#Normalization
X_train /= 255
X_test /= 255

#Convert to binary matrix
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)

#CNN modeling
model = Sequential()

model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1],
                        border_mode='valid',
                        input_shape=input_shape))
model.add(Activation('relu'))
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1]))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=pool_size))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))

#Compiling the model
model.compile(loss='categorical_crossentropy',
              optimizer='adadelta',
              metrics=['accuracy'])

#Learning
model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epoch,
          verbose=1, validation_data=(X_test, Y_test))
#Model evaluation
score = model.evaluate(X_test, Y_test, verbose=0)
print(score)

#Save the model architecture and the weights
model_json = model.to_json()
with open('mnist_cnn_architecture.json', 'w') as f:
    f.write(model_json)
model.save_weights('mnist_cnn_weights.h5', overwrite=True)
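To make the layer sizes in the CNN above easier to follow, here is a small plain-Python sketch that traces the image size through the network and counts the trainable parameters. It assumes stride-1 convolutions without padding ('valid') and non-overlapping 2x2 pooling, which is what the script's settings correspond to. The helper names are hypothetical.

```python
def conv_valid(size, kernel):
    """Output width/height of a stride-1 convolution without padding."""
    return size - kernel + 1


def max_pool(size, pool):
    """Output width/height of non-overlapping max pooling."""
    return size // pool


nb_filters, kernel, pool, nb_classes = 32, 3, 2, 10

size = 28
size = conv_valid(size, kernel)   # 26 after the first convolution
size = conv_valid(size, kernel)   # 24 after the second convolution
size = max_pool(size, pool)       # 12 after pooling
flattened = size * size * nb_filters

# Each layer has (inputs x outputs) weights plus one bias per output.
conv1_params = kernel * kernel * 1 * nb_filters + nb_filters
conv2_params = kernel * kernel * nb_filters * nb_filters + nb_filters
dense1_params = flattened * 128 + 128
dense2_params = 128 * nb_classes + nb_classes
total_params = conv1_params + conv2_params + dense1_params + dense2_params

print(flattened, total_params)  # 4608 600810
```

Most of the parameters sit in the first fully connected layer, not in the convolution layers.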

Learning results on CNN

Training was run in the cloud, but the following script was run in the Windows environment listed at the beginning of this article.

`my_mnist_cnn.py`


# coding: utf-8
import numpy as np
import matplotlib.pyplot as plt

from keras.utils import np_utils

from keras.models import Sequential,model_from_json
from keras.preprocessing import image
from keras.applications.imagenet_utils import preprocess_input, decode_predictions

from keras.datasets import mnist

#Number of classified classes
nb_classes = 10
#Number of images prepared for the test
test_num = 10
#Read mnist data
(X_train, y_train), (X_test, y_test) = mnist.load_data()

X_test = X_test.astype('float32')
#Normalize data
X_test /= 255

#Convert to binary array
Y_test = np_utils.to_categorical(y_test, nb_classes)

#Model loading
model = model_from_json(open('mnist_cnn_architecture.json').read())
#Load model weights
model.load_weights('mnist_cnn_weights.h5')
#After loading the architecture and weights, you have to compile if you want to evaluate the model
model.compile(loss='categorical_crossentropy',
              optimizer='adadelta',
              metrics=['accuracy'])
#Model evaluation
score = model.evaluate(X_test.reshape(-1,28,28,1), Y_test, verbose=0)
print(score)

for i in range(test_num):
    #Specifying the file name
    img_path = str(i) + ".jpg"
    #28 pixels x 28 pixels, import images in grayscale
    img = image.load_img(img_path, grayscale=True, target_size=(28, 28))
    #Convert the imported image to an array
    x = image.img_to_array(img)
    #Checking the array size
    print(x.shape)

    #Class prediction
    classified_label = model.predict_classes(x.reshape(-1,28,28,1))

    #Prediction result plot
    plt.subplot(2,5,i+1)
    plt.imshow(img, cmap='gray')
    plt.title("Class {0}".format(classified_label[0]))
    plt.axis('off')
plt.show()

So how do the results change with a convolutional neural network? The Jupyter Notebook with the execution results has been uploaded to Gist.

The accuracy is 80%, which is better than the result with the MLP. In my handwriting, 6 and 8, and 7 and 1, have similar features, so they may be easy to confuse. Adding my own digits to the existing training data might fix the remaining mistakes, but I have no plans to do that.

Conclusion

This time, we classified the MNIST handwritten digits. Training and test results were good, but the accuracy on the digits I wrote myself was not. Next time, I will talk about convolutional neural networks (CNN).

References

- [Deep Learning from Scratch: Theory and Implementation of Deep Learning Learned with Python](http://www.amazon.co.jp/%E3%82%BC%E3%83%AD%E3%81%8B%E3%82%89%E4%BD%9C%E3%82%8BDeep-Learning-%E2%80%95Python%E3%81%A7%E5%AD%A6%E3%81%B6%E3%83%87%E3%82%A3%E3%83%BC%E3%83%97%E3%83%A9%E3%83%BC%E3%83%8B%E3%83%B3%E3%82%B0%E3%81%AE%E7%90%86%E8%AB%96%E3%81%A8%E5%AE%9F%E8%A3%85-%E6%96%8E%E8%97%A4-%E5%BA%B7%E6%AF%85/dp/4873117585)
- [Deep Learning (Machine Learning Professional Series)](http://www.amazon.co.jp/%E6%B7%B1%E5%B1%A4%E5%AD%A6%E7%BF%92-%E6%A9%9F%E6%A2%B0%E5%AD%A6%E7%BF%92%E3%83%97%E3%83%AD%E3%83%95%E3%82%A7%E3%83%83%E3%82%B7%E3%83%A7%E3%83%8A%E3%83%AB%E3%82%B7%E3%83%AA%E3%83%BC%E3%82%BA-%E5%B2%A1%E8%B0%B7-%E8%B2%B4%E4%B9%8B/dp/4061529021/ref=sr_1_3?s=books&ie=UTF8&qid=1485363071&sr=1-3&keywords=%E6%B7%B1%E5%B1%A4%E5%AD%A6%E7%BF%92)
- Keras official documentation
- [Windows] A library where you can try Deep Learning immediately: Keras course, Part 1

[PYTHON] [Windows] A library where you can try Deep Learning immediately Keras course-Part 2
