[PYTHON] [TensorFlow] Least squares linear regression by gradient descent (stochastic descent)

TensorFlow: First step

TensorFlow is a scalable, multi-platform programming interface for implementing and executing machine learning algorithms.

To summarize briefly, it is a package that builds and executes a processing flow called a computation graph. PyTorch and Keras are catching up, but TensorFlow is still very popular.


It is harder to debug and harder to read than Keras. Probably because of those points, TensorFlow 2.0 has been released with a much simpler API.

In this article I will use the old TensorFlow 1.x syntax. Don't worry if you have installed TensorFlow 2.0; you can switch to the 1.x behavior with the following code.

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
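
As a quick sanity check (a minimal sketch; the exact version string depends on your environment), you can confirm that the 1.x-style API such as tf.placeholder is available again after the switch:

python

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

print(tf.__version__)           # prints a 2.x version string, but 1.x behavior is active
x = tf.placeholder(tf.float32)  # tf.placeholder works again under the compat shim
print(x)                        # Tensor("Placeholder:0", dtype=float32)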

The writing style is roughly as follows: you first define a computation graph, then run it in a session. (Figure: computation graph)

Let's run it for the time being. If you install the TensorFlow package and get an incomprehensible error when you try to execute it (this happens often), mercilessly uninstall and reinstall it.

python


import os
import numpy as np
import tensorflow as tf

g = tf.Graph()

with g.as_default():
    x = tf.placeholder(dtype=tf.float32, shape=(None), name='x')  # argument (placeholder)
    w = tf.Variable(2.0, name='weight')  # variable ①
    b = tf.Variable(0.7, name='bias')    # variable ②
    z = w * x + b
    init = tf.global_variables_initializer()  # variable initializer (it could also be defined inside tf.Session, but it is easier to handle here)

with tf.Session(graph=g) as sess:
    sess.run(init)                                   # run variable initialization
    for number in [1.0, 0.6, -1.8]:
        result = sess.run(z, feed_dict={x: number})  # evaluate z = w * x + b
        print(result)                                # print the processing result

result = sess.run(z, feed_dict={x: number})

Let me briefly explain just this part: sess.run(node you want to evaluate, feed_dict={placeholder: value you want to feed into the placeholder}).
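
To make the pattern concrete, here is a minimal sketch (the placeholder a and the graph a * 2.0 are my own illustration, not part of the model above): the same node can be evaluated with different fed values, including a list.

python

g = tf.Graph()
with g.as_default():
    a = tf.placeholder(dtype=tf.float32, shape=(None), name='a')  # placeholder to feed
    double = a * 2.0                                              # node to evaluate

with tf.Session(graph=g) as sess:
    print(sess.run(double, feed_dict={a: 3.0}))         # 6.0
    print(sess.run(double, feed_dict={a: [1.0, 2.0]}))  # [2. 4.]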

Let's keep going steadily. Next, we operate on arrays (tensors). In TensorFlow, the values flowing along the edges of the computation graph are called tensors; the name "tensor flow" is exactly that. Tensors can be interpreted as scalars, vectors, and matrices. (Figure: tensors) Now let's manipulate a tensor with TensorFlow.

python


g = tf.Graph()
with g.as_default():
    x = tf.placeholder(dtype=tf.float32, shape=(None, 2, 3), name='input_x')  # placeholder that receives the tensor
    x2 = tf.reshape(x, shape=(-1, 6), name='x2')  # transform the received argument x with tf.reshape

    print(x2)  # print the tensor definition

    xsum = tf.reduce_sum(x2, axis=0, name='col_sum')     # sum of each column
    xmean = tf.reduce_mean(x2, axis=0, name='col_mean')  # mean of each column

with tf.Session(graph=g) as sess:
    x_array = np.arange(18).reshape(3, 2, 3)  # create an input array
    print('Column Sums:\n', sess.run(xsum, feed_dict={x: x_array}))    # print the column sums
    print('Column Means:\n', sess.run(xmean, feed_dict={x: x_array}))  # print the column means

The point of tf.reshape(x, shape=(-1, 6), name='x2') is that -1 is specified in the shape. A -1 means that the size of that dimension is left undecided and is inferred from the total number of elements in the input array.
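
A minimal sketch of this (the array shapes here are just an example): feeding arrays with different leading dimensions through the same reshape shows the -1 being inferred each time.

python

g = tf.Graph()
with g.as_default():
    x = tf.placeholder(dtype=tf.float32, shape=(None, 2, 3), name='x')
    x2 = tf.reshape(x, shape=(-1, 6), name='x2')  # -1 is inferred from the input size

with tf.Session(graph=g) as sess:
    print(sess.run(x2, feed_dict={x: np.arange(6).reshape(1, 2, 3)}).shape)   # (1, 6)
    print(sess.run(x2, feed_dict={x: np.arange(12).reshape(2, 2, 3)}).shape)  # (2, 6)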

TensorFlow: Implementation of least squares linear regression

First of all, here is the outline of least squares linear regression, from the top:

① Calculate the predicted value using the formula y = w * x + b
② Square the error: (correct label - predicted value)^2
③ Take the mean of ②
④ Use ③ to update w and b
⑤ Repeat the above for a number of epochs
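
Here is a minimal numpy sketch of steps ① to ④ (the function name gd_step, the learning rate, and the gradient formulas are my own illustration, not the TensorFlow code below): for the cost J = mean((y - (w*x + b))^2), the gradients are dJ/dw = -2 * mean(x * error) and dJ/db = -2 * mean(error). Calling this repeatedly is step ⑤.

python

import numpy as np

def gd_step(w, b, x, y, lr=0.01):
    """One gradient-descent update for the cost mean((y - (w*x + b))**2)."""
    y_hat = w * x + b                 # ① predicted values
    err = y - y_hat                   # ② correct label - predicted value (before squaring)
    cost = np.mean(err ** 2)          # ③ mean of the squared errors
    grad_w = -2.0 * np.mean(x * err)  # gradient of the cost w.r.t. w
    grad_b = -2.0 * np.mean(err)      # gradient of the cost w.r.t. b
    return w - lr * grad_w, b - lr * grad_b, cost  # ④ updated w and b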

As a result, we can say the cost has converged when it settles at the global minimum. Below, I tried calculating this by hand in Excel. (Figure: manual calculation in Excel)

And here is the familiar figure: it plots the cost values (the yellow cells in the Excel sheet) and shows them converging. The cost function of a linear model is a differentiable convex function. (Figure: cost convergence curve)

Let's take a look at the actual code! First, prepare the training data.

python


import matplotlib.pyplot as plt

# training data
X_train = np.arange(10).reshape((10, 1))
y_train = np.array([1.0, 1.3, 3.1, 2.0, 5.0, 6.3, 6.6, 7.4, 8.0, 9.0])

# plot the training data
plt.scatter(X_train, y_train, marker='s', s=50, label='Training Data')
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.tight_layout()
plt.show()

Yes, I plotted it with matplotlib! (Figure: scatter plot of the training data)

Next, prepare the TfLinreg class.

python


class TfLinreg(object):

    # constructor
    def __init__(self, x_dim, learning_rate=0.01, random_seed=None):
        self.x_dim = x_dim
        self.learning_rate = learning_rate
        self.g = tf.Graph()

        with self.g.as_default():
            tf.set_random_seed(random_seed)
            self.build()

            # variable initializer
            self.init_op = tf.global_variables_initializer()

    def build(self):

        # define the placeholders
        self.X = tf.placeholder(dtype=tf.float32, shape=(None, self.x_dim), name='x_input')
        self.y = tf.placeholder(dtype=tf.float32, shape=(None), name='y_input')

        # tf.zeros: a tensor with all elements 0
        # here, a 1-element tensor each
        w = tf.Variable(tf.zeros(shape=(1)), name='weight')
        b = tf.Variable(tf.zeros(shape=(1)), name='bias')

        self.w = w
        self.b = b

        # calculate the predicted values
        # tf.squeeze: removes dimensions of size 1, lowering the rank of the tensor
        self.test = w * self.X + b
        self.z_net = tf.squeeze(w * self.X + b, name='z_net')

        # (actual value - predicted value) squared
        # tf.square: squares each element
        sqr_errors = tf.square(self.y - self.z_net, name='sqr_errors')
        self.sqr_errors = sqr_errors

        # cost function
        # tf.reduce_mean: computes the mean of the given elements
        self.mean_cost = tf.reduce_mean(sqr_errors, name='mean_cost')

        ## create the optimizer
        # GradientDescentOptimizer: plain gradient descent
        optimizer = tf.train.GradientDescentOptimizer(
            learning_rate=self.learning_rate,
            name='GradientDescent'
        )

        # compute the gradients of the cost (w.r.t. weight and bias) and apply them
        self.optimizer = optimizer.minimize(self.mean_cost)

At this point we have only defined the class, so no concrete numbers have been set yet. One note here: there are several types of gradient descent.

・Batch gradient descent
・Stochastic gradient descent (SGD)
・Mini-batch stochastic gradient descent (mini-batch SGD, MSGD)

With batch gradient descent, the parameters are updated only after the errors over all the data have been summed up. Stochastic gradient descent updates the weights for each single data point. The mini-batch variant sits between the two: a huge data set is cut into batches and one update is performed per batch, as in the sketch below.
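
A minimal sketch of the difference (this reuses the hypothetical gd_step from the earlier numpy sketch; the toy data and batch_size are arbitrary): the only thing that changes is how much data flows into each update.

python

import numpy as np

# toy data and starting parameters (arbitrary values for illustration)
X_all = np.arange(10, dtype=float)
y_all = 1.0 * X_all + 0.5
w, b, batch_size = 0.0, 0.0, 2

# batch gradient descent: one update per epoch, over all the data at once
w, b, _ = gd_step(w, b, X_all, y_all)

# stochastic gradient descent: one update per single sample
for xi, yi in zip(X_all, y_all):
    w, b, _ = gd_step(w, b, xi, yi)

# mini-batch SGD: one update per slice of batch_size samples
for start in range(0, len(X_all), batch_size):
    xb, yb = X_all[start:start + batch_size], y_all[start:start + batch_size]
    w, b, _ = gd_step(w, b, xb, yb)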

Let's continue. First, call the constructor to create an instance.

python


#Model instantiation
lrmodel = TfLinreg(x_dim=X_train.shape[1], learning_rate=0.01)

Then run the training.

python


### Training
# self.optimizer
def train_linreg(sess, model, X_train, y_train, num_epochs=10):
    # initialize the variables
    sess.run(model.init_op)
    training_costs = []

    # feed the same X_train num_epochs (10) times
    for i in range(num_epochs):
        """
        model.optimizer: applies the gradient descent step
        model.X: training data (rank-2 tensor)
        model.y: correct labels (rank-1 tensor)
        model.z_net: predicted values (computed from w * self.X + b)
        model.sqr_errors: (actual value - predicted value) squared
        model.mean_cost: mean of the squared errors
        model.w: weight after the update
        model.b: bias after the update
        """
        _, X, y, z_net, sqr_errors, cost, w, b = sess.run([
            model.optimizer,
            model.X,
            model.y,
            model.z_net,
            model.sqr_errors,
            model.mean_cost,
            model.w,
            model.b,
        ], feed_dict={model.X: X_train, model.y: y_train})

        print('  ')
        print(X)
        print(y)
        print(z_net)
        print(sqr_errors)
        print(cost)
        print(w)
        print(b)

        training_costs.append(cost)

    return training_costs

The gradient descent step is executed when model.optimizer is fetched.

Note that the values that come back for w and b are the ones after the update, which is quite confusing. So even though the first epoch prints

[0.60279995] [0.09940001], the first prediction itself was carried out with w = [0.] and b = [0.].
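
If you also want to see the pre-update values, one option (a sketch of my own, not part of the original code; it assumes the session on lrmodel.g created below is open and initialized) is to split the fetches into separate sess.run calls:

python

# fetch w and b before the update (these are the values this epoch's prediction uses)
w_before, b_before = sess.run([lrmodel.w, lrmodel.b])

# run one gradient-descent step on its own
sess.run(lrmodel.optimizer, feed_dict={lrmodel.X: X_train, lrmodel.y: y_train})

# fetch w and b after the update
w_after, b_after = sess.run([lrmodel.w, lrmodel.b])
print(w_before, b_before, w_after, b_after)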

Let's run it right away.

python


sess = tf.Session(graph=lrmodel.g)
training_costs = train_linreg(sess, lrmodel, X_train, y_train)

Let's plot the cost value.

python


plt.plot(range(1,len(training_costs) + 1), training_costs)
plt.tight_layout()
plt.xlabel('Epoch')
plt.ylabel('Training Cost')
#plt.savefig('images/13_01.png', dpi=300)
plt.show()

(Figure: training cost per epoch)

There it is: it converged!

Next, let's make predictions. Prediction is not difficult because it only evaluates the predicted-value node (z_net). You run it by feeding a rank-2 tensor to the argument X_test.

python


### Prediction
# model.z_net
def predict_linreg(sess, model, X_test):
    y_pred = sess.run(model.z_net, feed_dict={model.X:X_test})
    return y_pred
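
For example (X_new here is made-up input, with the same rank-2 shape as X_train):

python

X_new = np.array([[2.5], [7.0]])  # hypothetical new inputs, shape (2, 1)
print(predict_linreg(sess, lrmodel, X_new))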

Finally, let's visualize the model created with the training data.

python


### Plot the linear regression model
# training data
plt.scatter(X_train, y_train, marker='s', s=50, label='Training Data')

# model output on the training data
plt.plot(range(X_train.shape[0]), predict_linreg(sess, lrmodel, X_train),
         color='gray', marker='o', markersize=6, linewidth=3, label='LinReg Model')

plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.tight_layout()
plt.show()

(Figure: training data with the fitted regression line)

The regression line is drawn nicely! That's it for this time.
