[PYTHON] [Introduction to Tensorflow] Understand Tensorflow properly and try to make a model

This article is a translation of my blog, Data Science Struggle.

Purpose

Get a rough idea of what Tensorflow is and create a simple model.

Overview of Tensorflow

Let's take a brief look at what Tensorflow, a name that comes up in various places these days, actually is and how it is positioned in machine learning.

  1. What is Tensorflow? First of all, Tensorflow can be thought of as a calculator that, when actually used, automatically performs the calculations needed to determine the parameters of a neural network. The word calculator here literally means something that does calculation, such as answering 1 + 1.
  2. Where does it fit in machine learning? There are three main ways of working with machine learning. The first is a pattern in which you write the model yourself in your language of choice after properly understanding the algorithms and the mathematics. The second is a pattern in which you build a model in a few lines using a library such as sklearn in Python, or the library appropriate to the method in R. The third is a pattern in which you build the model yourself on top of a computation library suited to creating machine learning models.

Tensorflow belongs to the third pattern.

Here is a brief comment on the reality behind these three patterns. If you have actually studied the mathematics of machine learning, it is worth building a model once without relying on a library, both to confirm the mathematics and the mechanism and to allow more flexible rewriting. However, building every model that way for every situation is usually impractical, and reinventing the wheel is wasteful. So in practice you learn how to use a machine learning library and tackle the problem with it. Most libraries are very convenient, and if all you want is a model, you can reach your goal in just a few lines. Still, there are situations such libraries handle poorly. Neural networks are one example: not only the number of intermediate layers, but also how layers are joined and how data flows between them, can be varied in almost any way as long as certain rules are followed. It is hard to build such things with an off-the-shelf machine learning library, yet writing everything yourself raises too many concerns.


Computational libraries for machine learning, such as Tensorflow, resolve this situation. They are not machine learning libraries that produce a model in one line; they are libraries that help users build their own models. In other words, unlike a machine learning library that can be used without knowing the mechanism at all, they are difficult to use without at least some knowledge of how to build a model.

How Tensorflow calculates

Let's take a look at how Tensorflow actually does its calculations. As a concrete first example, let Tensorflow perform the following calculation.

5 + 3

If you solve this with Tensorflow, it looks like this.

add.py


import tensorflow as tf
a = tf.constant(5)
b = tf.constant(3)
added = tf.add(a, b)

with tf.Session() as sess:
    print(sess.run(added))

Here, if you try to print added directly,

Tensor("Add_1:0", shape=(), dtype=int32)

is output. Interpreted properly, added is not a concrete number but the form information of the calculation, namely the sum of a and b; the actual calculation is performed only when it is run inside a Session. In Tensorflow, this form information is called a graph.

Tensorflow variables

We saw a simple Tensorflow calculation above, but there is one more thing to pay attention to: variables. Since this is a computation library used in machine learning, it is essential to handle many dimensions at once, and, as mentioned above, the actual calculation is performed by run after the graph has been created, so it is convenient to be able to specify the values given to variables later. Tensorflow does in fact have that feature. In summary, Tensorflow has constants, variables, and placeholders (temporary variables that accept values assigned later), all of which can be multidimensional. They can be defined as follows.

#constant
a = tf.constant(3)

#variable
b = tf.Variable(0)

#Placeholder
c = tf.placeholder(tf.float32)
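
Of the three, tf.Variable holds state that can be updated across run calls, which is why the training code later has to initialize its variables first. A minimal sketch of my own (not from the original article) to illustrate:

import tensorflow as tf

b = tf.Variable(0)
increment = tf.assign(b, b + 1)  # an op that overwrites b with b + 1

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # variables must be initialized before use
    for _ in range(3):
        print(sess.run(increment))  # 1, 2, 3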

Specifically, if you try a calculation using placeholders:

a = tf.placeholder(tf.float32)
b = tf.placeholder(tf.float32)
added = tf.add(a, b)
with tf.Session() as sess:
    print(sess.run(added, feed_dict = {a: 3.0, b: 5.0}))

In the above example, everything up to added is the graph, and concrete values are given via feed_dict in the run phase to perform the calculation.
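
Since placeholders are multidimensional, the same graph also accepts whole arrays. A small sketch of my own to illustrate:

import tensorflow as tf

a = tf.placeholder(tf.float32)
b = tf.placeholder(tf.float32)
added = tf.add(a, b)

with tf.Session() as sess:
    # Feeding vectors instead of scalars works with the same graph.
    print(sess.run(added, feed_dict = {a: [1.0, 2.0], b: [3.0, 4.0]}))  # [4. 6.]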

Classification with Tensorflow

Let's actually classify the iris data using Tensorflow. From this point onward, the whole process is divided into two parts: blueprint creation and build. Blueprint creation is the part where you decide what kind of model, that is, what graph, to create. Build refers to determining the parameters in the blueprint by feeding the data into it. Below is a code example of the iris classification. The data used is the file downloaded from https://dl.dropboxusercontent.com/u/432512/20120210/data/iris.txt.

get_data.py


import pandas as pd
import tensorflow as tf
from sklearn import cross_validation

data = pd.read_csv('https://dl.dropboxusercontent.com/u/432512/20120210/data/iris.txt', sep = "\t")
data = data.ix[:, 1:]  # drop the leading index column
train_data, test_data, train_target, test_target = cross_validation.train_test_split(
    data.ix[:, ['Sepal.Length', 'Sepal.Width', 'Petal.Length', 'Petal.Width']],
    data.ix[:, ['Species']],
    test_size = 0.4, random_state = 0)

With the above code, the iris data is acquired and split into training data and test data. From here on, we create the model using the paired train_data and train_target; the answer when train_data is given is train_target. Of course, since the data is split into train and test, evaluation should be done on the test data, but it is omitted this time (a sketch is given at the end of the article). From here, we actually create the blueprint and build based on it.
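
As a quick sanity check of the split (my own addition; the shapes assume the 150-row iris file and test_size = 0.4):

print(train_data.shape)                  # (90, 4): 60% of the 150 rows, 4 features
print(test_data.shape)                   # (60, 4)
print(train_target['Species'].unique())  # the 3 iris species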

classify.py


#Blueprint creation
#Placeholder settings
X = tf.placeholder(tf.float32, shape = [None, 4])
Y = tf.placeholder(tf.float32, shape = [None, 3])

#Parameter setting
W = tf.Variable(tf.random_normal([4, 3], stddev=0.35))

#Activation function
y_ = tf.nn.softmax(tf.matmul(X, W))

#Build
#Loss function
cross_entropy = -tf.reduce_sum(Y * tf.log(y_))

#Learning
optimizer = tf.train.GradientDescentOptimizer(0.001)
train = optimizer.minimize(cross_entropy)

#Run
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(1000):
        x = train_data
        y = pd.get_dummies(train_target)
        print(sess.run(W))

        sess.run(train, feed_dict = {X: x, Y: y})

    test = sess.run(W)

Let's look at each part in turn. First, create the blueprint. If the matrix of explanatory variables is X, the matrix of explained variables is Y, and the weight is W, this blueprint can be expressed as follows. No bias term is used this time.

X: [None, 4]
W: [4, 3]
Y: [None, 3]
[None, 4] x [4, 3] = [None, 3]

The [None, 3] in the last line matches Y. Now let's move on to the actual code.

placeholder.py


#Placeholder settings
X = tf.placeholder(tf.float32, shape = [None, 4])
Y = tf.placeholder(tf.float32, shape = [None, 3])

Set the placeholders. X and Y set here correspond to the explanatory variables and the explained variable in the data, respectively, and the data will be fed into them later at run time. The iris data has four explanatory variables and three classes to classify into. Since the number of rows given when feeding the data (the number of rows in the data frame) is not known in advance, None is put there.

parameter.py


#Parameter setting
W = tf.Variable(tf.random_normal([4, 3], stddev=0.35))

Set the parameter part, that is, the weight. In the build part, this parameter W is updated and determined as data is fed in. tf.random_normal with stddev gives initial values drawn at random from the specified normal distribution. There are several ways to give initial values, and each is worth a look; a few alternatives are sketched below.
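
For reference, a few alternative initializers (a sketch of my own; all are standard Tensorflow 1.x functions):

W_zeros   = tf.Variable(tf.zeros([4, 3]))                           # all zeros
W_uniform = tf.Variable(tf.random_uniform([4, 3], -1.0, 1.0))       # uniform on [-1, 1]
W_trunc   = tf.Variable(tf.truncated_normal([4, 3], stddev = 0.1))  # normal, re-drawn beyond 2 sigma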

activate.py


#Activation function
y_ = tf.nn.softmax(tf.matmul(X, W))

Determine the activation function. An appropriate function has to be chosen according to the shape of the output and whether the layer is an intermediate layer or the output layer. In this case, y_ is the final output for the input; a couple of common alternatives are sketched below.
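
For comparison, two other common activation functions (a sketch of my own; W1 and W2 are hypothetical weight variables, not part of the model above):

# Hypothetical weights, defined only for this illustration.
W1 = tf.Variable(tf.random_normal([4, 5], stddev = 0.35))
W2 = tf.Variable(tf.random_normal([4, 1], stddev = 0.35))

hidden = tf.nn.relu(tf.matmul(X, W1))  # typical choice for intermediate layers
binary = tf.sigmoid(tf.matmul(X, W2))  # probability output, e.g. for a two-class problem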

Beyond this is the build part. Tensorflow handles the tricky calculations for us, but you still need to understand the structure of what it is doing. Described simply, the build part defines a loss function and updates the parameters so that the loss becomes smaller. The accuracy of the prediction depends on the value of the parameter W defined above, and we change the parameters so that the prediction accuracy becomes as high as possible. At that point, rather than focusing on how accurate the model is, we focus on how few mistakes it makes, and keep updating the parameters so that the mistakes decrease.

lost.py


#Loss function
cross_entropy = -tf.reduce_sum(Y * tf.log(y_))

This loss function is the "how few mistakes" part mentioned above. The loss function differs depending on whether the model to be created does classification or regression. Tensorflow provides multiple loss functions, so this is worth investigating; a regression example is sketched below.
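
For example, if this were a regression model rather than a classification one, a squared-error loss would be typical (a sketch of my own, reusing the Y and y_ names from above purely for illustration):

mse = tf.reduce_mean(tf.square(Y - y_))  # mean squared error for regression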

train.py


#Learning
optimizer = tf.train.GradientDescentOptimizer(0.001)
train = optimizer.minimize(cross_entropy)

The optimizer defines how the parameters are actually updated based on the data. train makes the loss function smaller (that is, updates the parameters) using the update method specified by the optimizer; another optimizer can be swapped in the same way, as sketched below.
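
For example (a sketch of my own, using another standard Tensorflow 1.x optimizer):

optimizer = tf.train.AdamOptimizer(0.001)  # adaptive learning-rate alternative to plain gradient descent
train = optimizer.minimize(cross_entropy)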

execute.py


#Run
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(1000):
        x = train_data
        y = pd.get_dummies(train_target)
        print(sess.run(W))

        sess.run(train, feed_dict = {X: x, Y: y})

This is where the concrete build is done. This time, keeping things simple, the whole dataset is fed as one unit and this is repeated 1000 times. In the sess.run() part, concrete data is given to the placeholders via feed_dict. In this way, parameters that locally minimize the loss function are searched for, and W is updated. In practice, the value of the loss is usually also written down as training progresses; a sketch of that, together with the test evaluation omitted earlier, follows.
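
A sketch of my own showing both points: writing down the loss during training, and the test evaluation omitted earlier. It reuses X, Y, y_, cross_entropy and train from classify.py and assumes numpy is available.

import numpy as np

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    x = train_data
    y = pd.get_dummies(train_target)
    for i in range(1000):
        # Run one update step and fetch the current loss at the same time.
        _, loss = sess.run([train, cross_entropy], feed_dict = {X: x, Y: y})
        if i % 100 == 0:
            print(i, loss)  # write down how the loss decreases

    # Evaluate on the held-out test data.
    pred = sess.run(y_, feed_dict = {X: test_data})
    truth = pd.get_dummies(test_target).values
    print(np.mean(np.argmax(pred, axis = 1) == np.argmax(truth, axis = 1)))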
