Introduction

I worked through "MNIST For ML Beginners", the TensorFlow tutorial written for people new to machine learning, and implemented MNIST digit recognition, which is often described as the Hello World of machine learning. MNIST is a data set of handwritten digit images. Here, images of the handwritten digits 0 to 9 are read in and classified by machine learning. The sample code in TensorFlow's GitHub repository looks difficult, but once I kept only the necessary parts, it was really simple and I could implement it in about 20 lines.

Environment

- Cloud9
- Python 2.7.6
- Sample code: GitHub

For setting up the environment, see "Use TensorFlow in cloud integrated development environment Cloud9 ~ GetStarted ~". For the basic usage of TensorFlow, see "Use TensorFlow in cloud integrated development environment Cloud9-Basics of usage-".

Learning process code

The code is on GitHub, and the implementation is reproduced below.

`mnist_softmax_train.py`


from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf

# Download gz files to MNIST_data directory
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

# Initializing
sess = tf.InteractiveSession()

x = tf.placeholder(tf.float32, shape=[None, 28*28])
y_ = tf.placeholder(tf.float32, shape=[None, 10])

W = tf.Variable(tf.zeros([28*28, 10]), name="W")
b = tf.Variable(tf.zeros([10]), name="b")

sess.run(tf.initialize_all_variables())

# Making model
y = tf.matmul(x, W) + b
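# Keyword arguments avoid mixing up the order; newer TensorFlow releases require them
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=y, labels=y_))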

# Training
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
for i in range(1000):
  batch = mnist.train.next_batch(100)
  train_step.run(feed_dict={x: batch[0], y_: batch[1]})

# Evaluating
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels}))

# Save the trained parameters
saver = tf.train.Saver()
saver.save(sess, 'param/softmax.param')

When I ran it, the accuracy was about 92%, as described in the tutorial.

The contents of the code

mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

This fetches the data. It creates a directory called MNIST_data and downloads four gz files into it. What the files contain is a black box at this point, so it is worth checking.
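To open the black box a little: as far as I know, the downloaded files are gzip-compressed IDX files, and an image file starts with four big-endian 32-bit integers (magic number 2051, number of images, rows, columns) followed by one unsigned byte per pixel. The helper name `read_idx_image_header` is my own, and the snippet feeds it synthetic bytes instead of the real download, so treat it as a sketch for inspecting the files rather than part of the tutorial code.

```python
import struct

def read_idx_image_header(raw):
    """Parse the 16-byte header of an IDX image file (hypothetical helper)."""
    magic, count, rows, cols = struct.unpack(">IIII", raw[:16])
    if magic != 2051:
        raise ValueError("not an IDX image file: magic=%d" % magic)
    return count, rows, cols

# Synthetic stand-in for a decompressed file: a header plus two blank 28x28 images.
fake_file = struct.pack(">IIII", 2051, 2, 28, 28) + b"\x00" * (2 * 28 * 28)
header = read_idx_image_header(fake_file)
print(header)
```

For a real file, the bytes could be read with `gzip.open(path, 'rb').read()`, assuming the training images keep their original name `train-images-idx3-ubyte.gz` inside MNIST_data.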

sess = tf.InteractiveSession()

x = tf.placeholder(tf.float32, shape=[None, 28*28])
y_ = tf.placeholder(tf.float32, shape=[None, 10])

W = tf.Variable(tf.zeros([28*28, 10]), name="W")
b = tf.Variable(tf.zeros([10]), name="b")

sess.run(tf.initialize_all_variables())

This is the initialization process. x holds the image data (28 pixels x 28 pixels, flattened into 784 values per image) and y_ holds the label (0-9) of each image as a vector of length 10. Since they are placeholders, the actual values are assigned later. W and b are the parameters: W holds the weight of each pixel for each digit, and b is the intercept.
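The shapes [None, 28*28] and [None, 10] are easier to picture with one concrete row each. Below is a minimal sketch in plain Python of what a single row of x and y_ looks like when one_hot=True; the helper names `flatten` and `one_hot` are mine and are not TensorFlow functions.

```python
def one_hot(label, num_classes=10):
    """Return a list with 1.0 at index `label` and 0.0 elsewhere."""
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec

def flatten(image):
    """Turn a 28x28 nested list into a flat list of 784 values, row by row."""
    return [pixel for row in image for pixel in row]

blank = [[0.0] * 28 for _ in range(28)]  # an all-background image
x_row = flatten(blank)                   # one row of x
y_row = one_hot(3)                       # one row of y_ for the digit 3
print("%d %s" % (len(x_row), y_row))
```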

y = tf.matmul(x, W) + b
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=y, labels=y_))

y is the expression that computes the prediction from the parameters and x: one raw score per digit. The next line applies softmax to those scores and takes the cross-entropy against the labels, averaged over the batch; this is softmax regression. Logistic regression is a model that predicts 0 or 1, and softmax regression extends it so that it can predict multiple labels (0-9 in this case). The tutorial also includes the detailed formulas, so I think those are worth studying as well.
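To make the formula less abstract, here is a small plain-Python sketch of softmax followed by cross-entropy for a single example with three classes. It is only meant to show the arithmetic; the function names and the sample numbers are mine, and TensorFlow's own implementation is certainly more involved.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, one_hot_label):
    """Negative log of the probability given to the true class."""
    return -sum(t * math.log(p) for p, t in zip(probs, one_hot_label) if t > 0)

logits = [2.0, 1.0, 0.1]   # raw scores, like one row of y
label = [1.0, 0.0, 0.0]    # one-hot label, like one row of y_
probs = softmax(logits)
loss = cross_entropy(probs, label)
print(probs)
print(loss)
```

The loss gets smaller as the probability of the correct class approaches 1, which is what training tries to achieve.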

train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
for i in range(1000):
  batch = mnist.train.next_batch(100)
  train_step.run(feed_dict={x: batch[0], y_: batch[1]})

GradientDescentOptimizer performs gradient descent with a learning rate of 0.5; because each step sees only a small batch of the data, this amounts to stochastic gradient descent. mnist.train.next_batch(100) takes the next 100 examples from the downloaded training data, and feed_dict={x: batch[0], y_: batch[1]} assigns them to the placeholders. Repeating this 1000 times moves the parameters toward the optimal solution.
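What one training step does can be sketched without TensorFlow. The following plain-Python example runs mini-batch gradient descent for softmax regression on a toy data set with two features and two classes. Everything here (the data, the batch size of 2, the function names) is my own assumption for illustration; the only fact it relies on is that the gradient of the cross-entropy with respect to each raw score is the predicted probability minus the label.

```python
import math
import random

def softmax(logits):
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(W, b, x):
    """Class probabilities for one input: softmax(x . W + b)."""
    logits = [sum(x[i] * W[i][k] for i in range(len(x))) + b[k]
              for k in range(len(b))]
    return softmax(logits)

def mean_loss(W, b, xs, ys):
    """Average cross-entropy over (input, one-hot label) pairs."""
    total = 0.0
    for x, y in zip(xs, ys):
        total -= math.log(predict(W, b, x)[y.index(1.0)])
    return total / len(xs)

def sgd_step(W, b, xs, ys, rate):
    """One gradient descent update on a mini-batch, modifying W and b in place."""
    n = len(xs)
    grad_W = [[0.0] * len(b) for _ in W]
    grad_b = [0.0] * len(b)
    for x, y in zip(xs, ys):
        p = predict(W, b, x)
        for k in range(len(b)):
            err = p[k] - y[k]  # gradient of the loss w.r.t. raw score k
            grad_b[k] += err / n
            for i in range(len(x)):
                grad_W[i][k] += x[i] * err / n
    for k in range(len(b)):
        b[k] -= rate * grad_b[k]
        for i in range(len(W)):
            W[i][k] -= rate * grad_W[i][k]

# Toy data: class 0 has a large first feature, class 1 a large second feature.
data_x = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
data_y = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]

W = [[0.0, 0.0], [0.0, 0.0]]  # zeros, like tf.zeros in the article
b = [0.0, 0.0]
loss_before = mean_loss(W, b, data_x, data_y)

random.seed(0)
for step in range(100):
    picks = [random.randrange(len(data_x)) for _ in range(2)]  # batch of 2
    sgd_step(W, b, [data_x[i] for i in picks], [data_y[i] for i in picks], 0.5)

loss_after = mean_loss(W, b, data_x, data_y)
print("%.3f -> %.3f" % (loss_before, loss_after))
```

The structure mirrors the TensorFlow loop: pick a batch, compute the gradient on that batch only, and nudge W and b by the learning rate.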

correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels}))

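Using the test data, we check how often y (the predicted value) agrees with y_ (the actual value). tf.argmax picks the digit with the highest score in each row, and the mean of the matches is the accuracy.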
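The same calculation can be written in a few lines of plain Python. This is a sketch with three made-up rows and three classes; `argmax` and `accuracy` are my own helper names standing in for the tf.argmax, tf.equal, tf.cast and tf.reduce_mean combination.

```python
def argmax(values):
    """Index of the largest entry (the first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(predictions, labels):
    """Fraction of rows where the predicted class matches the labelled class."""
    hits = [1.0 if argmax(p) == argmax(t) else 0.0
            for p, t in zip(predictions, labels)]
    return sum(hits) / len(hits)

scores = [[0.1, 2.0, 0.3], [1.5, 0.2, 0.1], [0.0, 0.4, 0.3]]  # like y
truth = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]                     # like y_
result = accuracy(scores, truth)
print(result)
```

Here the first two rows are classified correctly and the third is not, so two out of three predictions match.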

saver = tf.train.Saver()
saver.save(sess, 'param/softmax.param')

This saves the learned parameters. With them saved, you can classify images later by restoring the parameters, without training again.

Predicting your handwritten data

The code above learned from the MNIST handwritten digit data. Using those parameters, I tried to classify digits that I wrote myself. My handwritten data is also on GitHub; the images are black-and-white bmp files, which I load with parsebmp.py. MNIST pixel values range from 0 to 1, but to keep the processing simple my data uses only 0 and 1. When I actually ran it, the accuracy was 60%. That is far from 92%, but the model can still tell the digits apart to some extent. ⇒ It turned out that preprocessing is required to predict handwritten data. For more information, please refer to the article "Predicting your handwritten data with TensorFlow".
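The idea of reducing an image to 0 and 1 values can be sketched as follows. This is not the contents of parsebmp.py; it is a hypothetical `binarize` helper that assumes the pixels are already available as grayscale rows (0-255) with dark ink on a light background, and that a threshold of 128 is reasonable.

```python
def binarize(gray_rows, threshold=128):
    """Flatten grayscale rows (0-255) into a list of 0.0/1.0 values.

    Assumption: dark ink on a light background, so dark pixels become 1.0
    to match MNIST, where the digit strokes carry the high values.
    """
    return [1.0 if pixel < threshold else 0.0
            for row in gray_rows for pixel in row]

# A 2x3 stand-in image; a real input would be 28x28.
tiny = [[255, 0, 255],
        [255, 10, 200]]
flat = binarize(tiny)
print(flat)
```

A 28x28 image processed this way gives the 784 values that the placeholder x expects.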

`mnist_softmax.py`


from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf
import sys
import numpy as np
import parsebmp as pb

def whatisit(file, sess):
  print("File name is %s" % file)
  data = pb.parse_bmp(file)

  # Show bmp data
  for i in range(len(data)):
    sys.stdout.write(str(int(data[i])))
    if (i+1) % 28 == 0:
      print("")

  # Predicting
  d = np.array([data])
  result = sess.run(y, feed_dict={x: d})

  # Show result
  print(result)
  print(np.argmax(result, 1))


if __name__ == "__main__":
  # Restore parameters
  x = tf.placeholder(tf.float32, shape=[None, 28*28])
  y_ = tf.placeholder(tf.float32, shape=[None, 10])
  W = tf.Variable(tf.zeros([28*28, 10]), name="W")
  b = tf.Variable(tf.zeros([10]), name="b")
  y = tf.matmul(x, W) + b
  
  sess = tf.InteractiveSession()
  saver = tf.train.Saver()
  saver.restore(sess, 'param/softmax.param')

  # My data: classify My_data/0.bmp through My_data/9.bmp
  for digit in range(10):
    whatisit("My_data/%d.bmp" % digit, sess)

Conclusion

First, I implemented the code while following the tutorial. It's really easy to implement, but I think you need to understand the following points.

- Contents of the data used
- Detailed formula for softmax regression

Also, to really understand it, I would like to be able to feed in my own handwritten data and make predictions. I think that will lead to practical use. ⇒ I tried predicting my own handwritten data on 2017/03/27.

Change log

- 2018/06/12: Added a note about predicting handwritten data
- 2017/03/27: Added how to save parameters and how to predict your own handwritten data
- 2016/10/30: New post

[PYTHON] I tried a TensorFlow tutorial (MNIST for beginners) on Cloud9-Classification of handwritten images-