I tried TensorFlow

First, let's read Get started. Introduction | TensorFlow

The page opens with TensorFlow sample code right away, to show how simple it is to build a model from sample data. If you have no machine learning background at all, though, you won't understand what it means.

Various TensorFlow articles suggest that you can use it with zero machine learning knowledge, but since TensorFlow is just a tool, I think it is better to have at least the minimum background for the future.

Next, the guide recommends downloading the data and working on the classic MNIST (handwriting recognition). If you have no experience with MNIST, the "blue pill" is recommended, and if you are already somewhat used to machine learning, the "red pill" is recommended, so I take the "red pill" here without hesitation.

After all, I have studied machine learning for a month.

At this point I noticed that the guide says "Let's try MNIST", yet I haven't even installed TensorFlow. This Get started is not in a very good order. Installation plus an explanation of the first sample code seems like enough for one Qiita article, so I'll leave MNIST for another time.

Installing TensorFlow on Mac OS X

(Reference) Download and Setup | TensorFlow

The page says things like "if you use a GPU ...", but for now I want to focus on studying machine learning, so to avoid stumbling over environment setup I will not build a GPU environment. I'll skip everything related to GPU setup.

Setting it up in my existing Virtualenv environment looked easy, so I did that.

% virtualenv --system-site-packages ~/env/tensorflow

In the installation procedure on the site, the environment directory was ~/tensorflow, but I keep my environments together under ~/env, so I used the path above. Change it as appropriate for your setup.

% source ~/env/tensorflow/bin/activate

The console prompt now looks like the following, and you have an isolated environment for TensorFlow that won't pollute your other environments. We will install TensorFlow into it.

(tensorflow) %

Follow the instructions on the site:

(tensorflow) % export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-0.12.1-py3-none-any.whl
(tensorflow) % pip3 install --upgrade $TF_BINARY_URL

This completes the installation. Check that it works as shown below; if there are no problems, you're done.

(tensorflow) % python
...
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
>>> print(sess.run(hello))
Hello, TensorFlow!
>>> a = tf.constant(10)
>>> b = tf.constant(32)
>>> print(sess.run(a + b))
42

Explanation of linear regression

Now, the sample code that Get started presented out of nowhere was a simple linear regression. I'd like to walk through it in detail while explaining linear regression.

Since it is short, here is the entire code.

import tensorflow as tf
import numpy as np

# Create 100 phony x, y data points in NumPy, y = x * 0.1 + 0.3
x_data = np.random.rand(100).astype(np.float32)
y_data = x_data * 0.1 + 0.3

# Try to find values for W and b that compute y_data = W * x_data + b
# (We know that W should be 0.1 and b 0.3, but TensorFlow will
# figure that out for us.)
W = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
b = tf.Variable(tf.zeros([1]))
y = W * x_data + b

# Minimize the mean squared errors.
loss = tf.reduce_mean(tf.square(y - y_data))
optimizer = tf.train.GradientDescentOptimizer(0.5)
train = optimizer.minimize(loss)

# Before starting, initialize the variables.  We will 'run' this first.
init = tf.global_variables_initializer()

# Launch the graph.
sess = tf.Session()
sess.run(init)

# Fit the line.
for step in range(201):
    sess.run(train)
    if step % 20 == 0:
        print(step, sess.run(W), sess.run(b))

# Learns best fit is W: [0.1], b: [0.3]

# Close the Session when we're done.
sess.close()

The code has comments, but I think a little more detail helps, so I'll explain it step by step.

What you are trying to do with this code

Prepare 100 points on the straight line $y = \frac{1}{10} x + \frac{3}{10}$.
Perform linear regression using them as training data.
Define a model of the form $y = W x + b$ and perform linear regression with TensorFlow to obtain the parameters $W$ and $b$.
Confirm that $W$ and $b$ come out close to $\frac{1}{10} (= 0.1)$ and $\frac{3}{10} (= 0.3)$, respectively.

Code commentary

Import required libraries

import tensorflow as tf
import numpy as np

Preparation of training data

# Create 100 phony x, y data points in NumPy, y = x * 0.1 + 0.3
x_data = np.random.rand(100).astype(np.float32)
y_data = x_data * 0.1 + 0.3

Normally, training data for linear regression comes from real data, but since this is a tutorial, 100 training points whose distribution is known in advance are generated. First, NumPy's random function produces 100 x values as x_data, and then the corresponding y values on the line are computed as y_data. In other words, the training data has no noise: every point lies exactly on $y = \frac{1}{10} x + \frac{3}{10}$.

If we model this as $y = W x + b$, then $W = 0.1$ and $b = 0.3$ are obvious, but the purpose of this tutorial is to perform linear regression with TensorFlow using these points as training data.
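
As a sanity check on what the regression should recover, here is a minimal sketch that fits the same kind of data with the closed-form least-squares formulas instead of TensorFlow. It is plain Python rather than NumPy, and the helper name `least_squares_line` and the fixed seed are my own assumptions, not part of the tutorial.

```python
import random

# Assumption: plain-Python stand-in for the tutorial's NumPy data generation.
random.seed(0)  # fixed seed so the sketch is reproducible
x_data = [random.random() for _ in range(100)]
y_data = [x * 0.1 + 0.3 for x in x_data]


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


W_exact, b_exact = least_squares_line(x_data, y_data)
print(W_exact, b_exact)  # both should be very close to 0.1 and 0.3
```

Because the points have no noise, the exact solution lands on the generating parameters up to floating-point error. The TensorFlow code reaches the same values iteratively.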

Model definition of linear regression

# Try to find values for W and b that compute y_data = W * x_data + b
# (We know that W should be 0.1 and b 0.3, but TensorFlow will
# figure that out for us.)
W = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
b = tf.Variable(tf.zeros([1]))
y = W * x_data + b

The line W = ... gives the initial value of W, and the line b = ... gives the initial value of b.

In neural networks and the like, W must be initialized with small random numbers to break symmetry, but in linear regression W could just as well start from tf.zeros([1]), like b (the result is the same). Here I use the sample code as is, so W is initialized with random numbers.

y = ... is the set of y values that the model outputs for x_data. In other words, it is the hypothesis function.

y = h_\theta(x)

where $h_\theta(x) = W x + b$.

Definition of cost function

# Minimize the mean squared errors.
loss = tf.reduce_mean(tf.square(y - y_data))
optimizer = tf.train.GradientDescentOptimizer(0.5)
train = optimizer.minimize(loss)

Now, we want to select $ W $ and $ b $ so that the prediction of the model above has the least error with the actual data, so we first define the error. The error at a point is defined by the square of the difference between the predicted data y and the actual data y_data, and the average of the errors at all points is defined as loss.

Let's reconfirm the cost function of linear regression here.

J(\theta) = \frac{ 1 }{ m }\sum_{ i = 1 }^{ m }\frac{ 1 }{ 2 }(h_\theta(x^{ (i) }) - y^{ (i) })^2

loss is the Python representation of this $J(\theta)$ using TensorFlow. (The factor $\frac{1}{2}$ is omitted because it does not affect the minimization.)
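
To make the definition concrete, here is a minimal plain-Python sketch of the same mean squared error, without TensorFlow. The function name `mse_loss` and the three toy points are hypothetical, chosen only for illustration.

```python
def mse_loss(W, b, xs, ys):
    """Mean of squared differences between W*x + b and y (no 1/2 factor)."""
    return sum((W * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical toy data lying exactly on y = 0.1 * x + 0.3.
xs = [0.0, 0.5, 1.0]
ys = [0.1 * x + 0.3 for x in xs]

loss_at_truth = mse_loss(0.1, 0.3, xs, ys)   # parameters that generated the data
loss_elsewhere = mse_loss(0.5, 0.0, xs, ys)  # some other parameters
print(loss_at_truth, loss_elsewhere)
```

The loss is zero at the generating parameters and positive anywhere else, which is why minimizing it recovers $W$ and $b$.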

Next, we define the optimizer. This is where you decide which algorithm to use to minimize the cost. Here we specify GradientDescentOptimizer, which is the steepest descent method among the algorithms TensorFlow provides. The argument 0.5 is the learning rate. Ideally you would try various values, but 0.5 works well here, so I used it.

For more on the learning rate (= learning coefficient), see [Practical level of machine learning in one month # 4 (Linear regression)](http://qiita.com/junichiro/items/d4dc67bdebb9fccb7ef5#%E5%AD%A6%E7%BF%92%E4%BF%82%E6%95%B0)

The documentation lists the optimizers that come built in. Optimizers | TensorFlow

Finally, we tell this optimizer to minimize loss.
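
For reference, the textbook steepest descent update for this loss can be written out as follows. The symbol $\alpha$ for the learning rate (0.5 here) is my own notation; $m$, $x^{(i)}$ and $y^{(i)}$ follow the cost function above, with the $\frac{1}{2}$ dropped as in loss. This is a sketch of the standard rule, which I assume matches what the optimizer does internally.

```math
\begin{aligned}
W &\leftarrow W - \alpha \, \frac{2}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x^{(i)} \\
b &\leftarrow b - \alpha \, \frac{2}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)
\end{aligned}
```

Each step moves $W$ and $b$ against the gradient of the loss, and $\alpha$ controls how large that move is.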

Initialization before execution

# Before starting, initialize the variables.  We will 'run' this first.
init = tf.global_variables_initializer()

In fact, TensorFlow hasn't computed anything with the code so far. All it has done is configuration and preparation. Learning actually happens later, when run is executed. Before that, the variables need to be initialized, and that is what this step prepares.

By the way, note that somewhat older introductory articles may use `initialize_all_variables` here. It is deprecated, so there is no reason to use it anymore.

(Reference) TensorFlow initialize_all_variables became Deprecated

Execution phase

# Launch the graph.
sess = tf.Session()
sess.run(init)

# Fit the line.
for step in range(201):
    sess.run(train)
    if step % 20 == 0:
        print(step, sess.run(W), sess.run(b))

# Learns best fit is W: [0.1], b: [0.3]

# Close the Session when we're done.
sess.close()

Last is the execution phase. TensorFlow learns inside a Session according to the settings above. In this example, the loop is range(201), so the cost-minimization operation assigned to train is executed 201 times (steps 0 through 200). The state is printed to the console every 20 steps so that you can follow the progress.

Result
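
To show what the loop is doing numerically, here is a minimal plain-Python sketch of the same training run without TensorFlow. It assumes the optimizer performs ordinary gradient descent on the mean squared error. The function `fit`, the fixed seed, and the variable names are my own and not part of the tutorial.

```python
import random

random.seed(0)  # assumption: fixed seed, only to make the sketch reproducible
x_data = [random.random() for _ in range(100)]
y_data = [x * 0.1 + 0.3 for x in x_data]


def fit(W, b, learning_rate=0.5, steps=201):
    """Plain gradient descent on the mean squared error of y = W*x + b."""
    m = len(x_data)
    for step in range(steps):
        errors = [W * x + b - y for x, y in zip(x_data, y_data)]
        grad_W = 2.0 / m * sum(e * x for e, x in zip(errors, x_data))
        grad_b = 2.0 / m * sum(errors)
        W -= learning_rate * grad_W
        b -= learning_rate * grad_b
        if step % 20 == 0:
            print(step, W, b)  # progress report every 20 steps
    return W, b


# Random initial W in [-1, 1) and zero initial b, as in the sample code.
W_fit, b_fit = fit(W=random.uniform(-1.0, 1.0), b=0.0)

# Starting W from zero instead should converge to the same line.
W_zero, b_zero = fit(W=0.0, b=0.0)
```

Both runs should approach $W = 0.1$ and $b = 0.3$, which illustrates that the random initial value of W is not essential for linear regression.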

0 [ 0.17769754] [ 0.34861392]
20 [ 0.1106104] [ 0.29447961]
40 [ 0.10272982] [ 0.29857972]
60 [ 0.10070232] [ 0.29963461]
80 [ 0.10018069] [ 0.29990602]
100 [ 0.1000465] [ 0.29997581]
120 [ 0.10001197] [ 0.29999378]
140 [ 0.10000309] [ 0.2999984]
160 [ 0.1000008] [ 0.29999959]
180 [ 0.10000021] [ 0.29999989]
200 [ 0.1000001] [ 0.29999995]

It's not very exciting, but you can see that as the iterations repeat, the values gradually converge to $W = 0.1$ and $b = 0.3$.

Once you understand this, I think applying it to general linear regression problems is not too difficult, and you can start using it right away. Classification problems such as logistic regression should also be manageable by adding one more twist to the cost function.
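
As a rough illustration of that "twist", here is a minimal plain-Python sketch of a logistic regression cost: the model output goes through a sigmoid and the squared error is replaced by cross-entropy. The function names and toy data are hypothetical, and this is not code from the tutorial.

```python
import math


def sigmoid(z):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def logistic_loss(W, b, xs, labels):
    """Mean cross-entropy between sigmoid(W*x + b) and 0/1 labels."""
    total = 0.0
    for x, t in zip(xs, labels):
        p = sigmoid(W * x + b)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(xs)


# Hypothetical toy data: negative x is class 0, positive x is class 1.
xs = [-2.0, -1.0, 1.0, 2.0]
labels = [0, 0, 1, 1]

good = logistic_loss(3.0, 0.0, xs, labels)   # separates the classes correctly
bad = logistic_loss(-3.0, 0.0, xs, labels)   # separates them the wrong way
print(good, bad)
```

The parameters that classify the points correctly give a much smaller loss, so the same minimize-the-cost procedure applies.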

(Reference) Making machine learning to practical level in one month # 7 (Classification problem: Logistic regression # 1)

If possible, I would like to practice with a neural network. Neural networks will probably come up in the "classic MNIST (handwriting recognition)" that I skipped this time, so I'd like to try it soon.

Postscript: I wrote it! Introduction to TensorFlow ~ Petit application of MNIST for beginner ~ I tried the TensorFlow tutorial "classic MNIST (handwriting recognition)" and extended it a little with my own knowledge. The tutorial uses the softmax function for the cost function and stochastic gradient descent for minimization, and I deepened my understanding by swapping in the sigmoid function and the steepest descent method.

Reference

Other TensorFlow introductory articles that I referred to

From now on

Try a neural network with the classic MNIST (handwriting recognition)
I want to see how much the speed improves with the GPU version
I want to try it on AWS
(Reference) Running TensorFlow on AWS GPU instance
I want to apply it to practical tasks
Automatic classification of news article themes
Spam judgment or usefulness prediction of comments attached to articles
AI for Go (9x9 board)

I would like to do this kind of thing in the future.
