[PYTHON] [Roughly translate TensorFlow Tutorial into Japanese] 1. MNIST For ML Beginners

MNIST for machine learning beginners

Introduction

These are my notes on TensorFlow, the deep learning library that Google has recently released. TensorFlow comes with detailed tutorials, so I tried translating one of them into Japanese.
・ About TensorFlow -> http://www.tensorflow.org/
・ Original text translated here -> http://www.tensorflow.org/tutorials/mnist/beginners/index.md
Please note that I am a native Japanese speaker, so some of the translation may seem strange.

This tutorial corresponds to Chapter 2 of Mr. Okaya's book "Deep Learning", so you may want to read the two together.

The model built this time is called softmax regression.

MNIST data

Load the MNIST data.

import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

The downloaded data is split into two parts: 60,000 points of training data (mnist.train) and 10,000 points of test data (mnist.test).

Every MNIST data point consists of two elements: an image of a handwritten digit and the corresponding label. Here, the images are called xs and the labels are called ys (e.g., mnist.train.images and mnist.train.labels).

Each handwritten image is 28 pixels x 28 pixels = 784 numbers. In other words, we treat it as a sequence of numbers (a vector).

Flattening the image this way throws away its two-dimensional structure. The most advanced methods do exploit that structure, but we will not go that far here.

As a result, mnist.train.images is a tensor (≒ an n-dimensional array) with shape [60000, 784].
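The flattening described above can be sketched in plain Python. This is only an illustration: the 3x3 `image` and the `flatten` helper are made up for the example and are not part of the tutorial's code.

```python
# Hypothetical 3x3 "image" standing in for a 28x28 MNIST digit.
image = [
    [0.0, 0.5, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.5, 0.0],
]


def flatten(image_2d):
    """Concatenate the rows of a 2-D image into one flat vector."""
    return [pixel for row in image_2d for pixel in row]


vector = flatten(image)
print(len(vector))  # 9 values for a 3x3 image; a 28x28 image gives 784

# A dataset is then a list of such vectors, with shape [n_images, n_pixels].
dataset = [flatten(image), flatten(image)]
shape = [len(dataset), len(dataset[0])]
print(shape)
```

After flattening, neighbouring rows of the image are no longer distinguishable from any other positions in the vector, which is what "throwing away the two-dimensional information" means.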

The corresponding labels are represented as "one-hot vectors", e.g.
0 -> [1,0,0,0,0,0,0,0,0,0]
3 -> [0,0,0,1,0,0,0,0,0,0]
So mnist.train.labels is an array of shape [60000, 10].
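As a minimal sketch, a one-hot vector for a digit label could be built like this. The `one_hot` function is a hypothetical helper for illustration; the tutorial's loader does this for you when `one_hot=True` is passed.

```python
def one_hot(label, num_classes=10):
    """Return a list of zeros with a single 1 at position `label`."""
    vec = [0] * num_classes
    vec[label] = 1
    return vec


print(one_hot(0))  # [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
print(one_hot(3))  # [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
```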

Softmax regression

Softmax regression is a natural and simple model. Softmax is useful when you want to assign probabilities to an object being one of several different things. (Example: this image is 80% likely to be a 9, 5% likely to be an 8, ...)

To work out the evidence that a given image belongs to a particular class, we compute a weighted sum of its pixel values. The weights have a meaning: a weight is negative if a high pixel intensity is evidence against the image being in that class, and positive if it is evidence in favor. In the figure of the learned weights, red represents negative weights and blue represents positive weights. (* The blue parts look like digits..?)
[Figure: the weights learned for each digit class]

Implementing the regression
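The step from weighted sums to probabilities can be sketched as follows. This is a plain-Python illustration of the softmax function, not TensorFlow's implementation; the `evidence` values are invented, and subtracting the maximum is a common numerical-stability trick that does not change the result.

```python
import math


def softmax(scores):
    """Turn a list of real-valued scores into probabilities that sum to 1."""
    highest = max(scores)
    # Subtracting the maximum avoids overflow in exp() and leaves the
    # resulting probabilities unchanged.
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


evidence = [1.0, 2.0, 5.0]  # hypothetical weighted sums for three classes
probs = softmax(evidence)
print(probs)       # the class with the largest evidence gets the largest share
print(sum(probs))  # 1.0, up to floating-point rounding
```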

To do efficient numerical computation in Python, we usually use libraries such as NumPy that perform the heavy operations outside of Python. However, switching back to Python after every operation still carries a lot of overhead. This is especially bad if you want to run computations on GPUs or in a distributed manner.

Instead of running a single heavy operation independently of Python, TensorFlow lets us describe a whole graph of interacting operations that runs entirely outside Python.

* Overhead: the cost of managing hardware and programs, on top of the computation itself.

First, import tensorflow.

import tensorflow as tf

We describe these interacting operations by manipulating symbolic variables.

x = tf.placeholder("float",[None,784])

x is not a specific value; it is a placeholder. Its value is supplied when we ask TensorFlow to run the computation.

Each MNIST image has been flattened into a 784-dimensional vector, so a set of input images is represented as a two-dimensional tensor of shape [None, 784]. Here None means that the dimension can be of any length.

The model also needs weights and biases. Variables are used for these. A Variable is a modifiable tensor that "lives" in TensorFlow's graph of interacting operations.
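The idea of "describe first, supply values later" can be illustrated with a toy analogy in plain Python. This is not how TensorFlow is implemented; the `Placeholder` and `Add` classes below are hypothetical and exist only for this sketch.

```python
# Toy analogy only: this is NOT how TensorFlow works internally.
class Placeholder(object):
    """A named slot whose value is supplied only when the graph is run."""

    def __init__(self, name):
        self.name = name

    def evaluate(self, feed_dict):
        return feed_dict[self.name]


class Add(object):
    """A node that adds the results of two other nodes."""

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def evaluate(self, feed_dict):
        return self.left.evaluate(feed_dict) + self.right.evaluate(feed_dict)


a = Placeholder("a")
b = Placeholder("b")
total = Add(a, b)  # only describes the computation; nothing is computed yet

result = total.evaluate({"a": 2, "b": 3})  # values are fed in at run time
print(result)  # 5
```

The same description can be evaluated again with different fed values, which is the role the placeholder x plays in the model.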

W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))

Here I created W and b as Variables filled with zeros. Since W and b are going to be learned, their initial values do not matter much.

Now we can implement the model!

y = tf.nn.softmax(tf.matmul(x,W)+b)

tf.matmul(x, W) multiplies x by W, b is added to the result, and tf.nn.softmax applies the softmax function to the sum. x is kept as a two-dimensional tensor so that it can be multiplied by W. (matmul presumably stands for matrix multiplication.)
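To make the shapes concrete, here is a plain-Python sketch of softmax(xW + b) for a small batch. The toy sizes (4 "pixels", 3 classes) and the `predict` helper are assumptions for illustration; the real model uses 784 pixels and 10 classes.

```python
import math


def softmax(scores):
    """Turn a list of scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(x_batch, W, b):
    """Compute softmax(x W + b) for every row x of x_batch."""
    outputs = []
    for x in x_batch:
        # One weighted sum of the pixels per class, plus that class's bias.
        logits = [
            sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
            for j in range(len(b))
        ]
        outputs.append(softmax(logits))
    return outputs


# Toy sizes: 4 "pixels" and 3 classes instead of 784 and 10.
x_batch = [[1.0, 0.0, 0.5, 0.0], [0.0, 1.0, 0.0, 0.5]]
W = [[0.0] * 3 for _ in range(4)]  # shape [4, 3], all zeros like tf.zeros
b = [0.0] * 3                      # shape [3]
y = predict(x_batch, W, b)
print(y)  # with all-zero W and b, every class gets the same probability, 1/3
```

A batch of shape [2, 4] times a weight matrix of shape [4, 3] gives an output of shape [2, 3]: one probability distribution per input row.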

Learning

In machine learning, an error function is used to evaluate quantitatively how good a model is. This time, we use an error function called cross-entropy:

H_{y'}(y) = -\sum_i y'_i \log(y_i)

Here, y is the predicted probability distribution and y' is the true distribution.

To implement cross-entropy, we first need to add a new placeholder for feeding in the correct answers.

y_ = tf.placeholder("float",[None,10])

Then we can implement the cross-entropy.

cross_entropy = -tf.reduce_sum(y_*tf.log(y))

tf.log(y) computes the logarithm of each element of y, each result is multiplied by the corresponding element of y_, and finally tf.reduce_sum adds up all the elements of the tensor.

Training is easy with TensorFlow. Because TensorFlow knows the entire graph of your computations, it can automatically apply the backpropagation algorithm, which efficiently determines how each variable affects the error function to be minimized.
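For a single example, the same calculation can be sketched in plain Python. The probability values below are invented, and skipping the terms where the true label is 0 is a choice made for this sketch (it avoids taking log(0) and does not change the sum).

```python
import math


def cross_entropy(y_true, y_pred):
    """Return -sum(y_true[i] * log(y_pred[i])) over all classes i."""
    # Terms with y_true[i] == 0 contribute nothing, so they are skipped.
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)


y_true = [0, 0, 1]  # one-hot label: the correct class is index 2
good_prediction = [0.1, 0.2, 0.7]
bad_prediction = [0.7, 0.2, 0.1]

loss_good = cross_entropy(y_true, good_prediction)
loss_bad = cross_entropy(y_true, bad_prediction)
print(loss_good)  # -log(0.7), about 0.357
print(loss_bad)   # -log(0.1), about 2.303
```

With a one-hot label, the cross-entropy reduces to the negative log of the probability assigned to the correct class, so a confident wrong prediction is penalized heavily.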

train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)

Here, we ask TensorFlow to minimize cross_entropy using the gradient descent algorithm with a learning rate of 0.01. Gradient descent is a simple procedure in which TensorFlow shifts each variable slightly in the direction that reduces the error. TensorFlow also provides many other optimization algorithms, and switching to one of them is very easy.

Before training the model, we only need to add one operation that initializes the variables we created. (* Not yet executed!)

init = tf.initialize_all_variables()
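The update rule itself can be sketched on a toy problem. The error function f(w) = (w - 3)^2 below is an assumption chosen for illustration and has nothing to do with the MNIST loss; only the learning rate of 0.01 and the 1000 steps are taken from the text.

```python
def gradient(w):
    """Derivative of the toy error function f(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)


learning_rate = 0.01  # same rate as in the text
w = 0.0               # starting value, analogous to initializing with zeros

for step in range(1000):
    # Move w a small step in the direction that decreases the error.
    w = w - learning_rate * gradient(w)

print(w)  # very close to 3.0, where f is smallest
```

In the real model the same kind of step is applied to every entry of W and b at once, with the gradients supplied by backpropagation.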

Finally, you can start the session. Let's execute the instruction to initialize the variable.

sess = tf.Session()
sess.run(init)

And repeat the training 1000 times!

for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

At each step of the loop, we get a "batch" of 100 random data points from the training set. We then run train_step, feeding in the batch data to fill the placeholders.

Training on small batches of random data is called stochastic training; in this case, stochastic gradient descent.
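Drawing a random batch while keeping images and labels paired can be sketched like this. The `next_batch` function here is a hypothetical stand-in written for illustration; it is not the implementation behind mnist.train.next_batch, which may select its batches differently.

```python
import random


def next_batch(xs, ys, batch_size, rng=random):
    """Pick batch_size random examples, keeping each x with its own y."""
    indices = rng.sample(range(len(xs)), batch_size)
    return [xs[i] for i in indices], [ys[i] for i in indices]


# Toy dataset: ten "images" (plain numbers) whose labels are twice the value.
xs = list(range(10))
ys = [2 * value for value in xs]

random.seed(0)  # fixed seed so that repeated runs pick the same batch
batch_xs, batch_ys = next_batch(xs, ys, 4)
print(batch_xs, batch_ys)
```

Each training step then sees only a small sample of the data, which is much cheaper than using all of it while still pointing the update in roughly the right direction.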

Evaluation of the created model

First, we work out where the model predicted the correct label. tf.argmax is a very useful function that returns the index of the largest entry in a tensor along a given axis. For example, tf.argmax(y, 1) returns the label the model considers most likely for each input, and tf.argmax(y_, 1) returns the correct label. tf.equal can then be used to check whether each prediction matches the truth.

correct_prediction = tf.equal(tf.argmax(y,1),tf.argmax(y_,1))

The result is a list of Boolean values. To make it easier to work with, we cast it to floating-point numbers and take the mean. For example, [True, False, True, True] becomes [1,0,1,1], and the mean is 0.75.

accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
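The same evaluation can be written out in plain Python as a minimal sketch. The `argmax` and `accuracy` helpers and the four predictions are made up for illustration, chosen so that they reproduce the [True, False, True, True] example.

```python
def argmax(values):
    """Return the index of the largest element in a list."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(predictions, labels):
    """Fraction of rows where the predicted class equals the true class."""
    correct = [argmax(p) == argmax(t) for p, t in zip(predictions, labels)]
    return sum(correct) / float(len(correct))


# Hypothetical model outputs for four inputs and three classes.
predictions = [
    [0.1, 0.8, 0.1],  # predicts class 1 (correct)
    [0.6, 0.3, 0.1],  # predicts class 0 (wrong)
    [0.2, 0.2, 0.6],  # predicts class 2 (correct)
    [0.3, 0.4, 0.3],  # predicts class 1 (correct)
]
labels = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
]

print(accuracy(predictions, labels))  # 0.75
```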

Finally, we compute the accuracy on the test data.

print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))

This result will be around 91%.
