[Python] Basic notes on Theano

As a follow-up to my earlier blog post, I'd like to summarize a few more details about Theano that didn't fit on the slides.

First of all, the "Theano explanation" article is a concise description of Theano's features, so I recommend reading it.

As described there, Theano's main features include:

- computations are written as symbolic expressions instead of being executed immediately
- expressions are compiled into fast functions (on CPU or GPU)
- expressions can be differentiated automatically (T.grad)

And so on.

Theano's super-simplified tutorial

A rough summary of the official tutorial: http://deeplearning.net/software/theano/tutorial/index.html#tutorial

First, the three imports you always start with:

import numpy
import theano
import theano.tensor as T

These three lines are a standing convention.

The minimum you need to know

If you have a general understanding of the following topics, you should be able to read Deep Learning implementations and modify them.

Variables (symbols)

Variables in Theano are handled through the concept of a "tensor". The types and operations around tensors are mostly defined under theano.tensor (imported as T).

I don't fully understand tensors myself, but for now it's enough to know that **under T are the variable types and the major general-purpose mathematical functions (exp, log, sin, etc.)**.

Here, "** variable type **" is

there is. These are combined to represent the variable type (tensor type).

As a name,

And

And so on (there are others).

Combining these,

And so on. For example:

x = T.lscalar("x")   # an int64 scalar named "x"
m = T.dmatrix()      # a float64 matrix (the name is optional)

These x and m are "symbols" and do not have actual values. This is a little different from ordinary Python variables.

See below for more information on variable generation. http://deeplearning.net/software/theano/library/tensor/basic.html#libdoc-basic-tensor

The result of an operation on symbols is also a symbol

For example

x = T.lscalar("x")
y = x*2
z = T.exp(x)

Suppose we run this. Since x is a symbol with no value, y has no value either; y is itself a symbol meaning "x * 2". Likewise, z is a symbol for exp(x). (In reality they are Python objects.)

The calculations that make up a neural network are likewise treated as a mass of operations between these symbols (in short, an expression) until an actual value is supplied. Because everything remains an expression, it is easy for humans to follow, and I think this is what enables the automatic differentiation described later, as well as optimization when the function is built.
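By the way, you can inspect the expression a symbol stands for with theano.pp (pretty-print). A small sketch, with x and y as above; the exact output string may vary by version:

>>> print(theano.pp(y))
(x * TensorConstant{2})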

function: building functions

In order to actually perform the calculation, it is necessary to define a "function".

For example, if you want to build **f(x) = x*2**, you can write

f = theano.function([x], x*2)

or, equivalently,

y = x*2
f = theano.function([x], y)

Either way, f becomes a function; calling it gives:

>>> f(3)
array(6)


When making a function, theano.function(inputs, outputs, ...) is specified: the first argument is the list of input symbols, the second the output expression. It seems the function is compiled at this point, so even complex functions execute quickly.
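Incidentally, a function can take several input symbols and also return several outputs at once, since the outputs argument may be a list. A small sketch (names are mine):

a = T.dscalar("a")
b = T.dscalar("b")
f2 = theano.function([a, b], [a + b, a * b])   # list of inputs, list of outputs

>>> f2(2, 3)
[array(5.0), array(6.0)]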

theano.function also has the keyword argument **givens**. As the name implies, givens works like "replace a symbol in the expression with another symbol or value".

For example

>>> x = T.dscalar()
>>> y = T.dscalar()
>>> c = T.dscalar()
>>> ff = theano.function([c], x*2+y, givens=[(x, c*10), (y,5)])
>>> ff(2)
array(45)

The value being computed is "x * 2 + y", but the function itself takes only the symbol "c" as its argument. The expression cannot be evaluated unless x and y are given, so their values are supplied through the givens part instead. Later tutorials use this to feed functions with parts of the data in machine learning, as sketched below.
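For instance, that minibatch pattern can be sketched like this: the whole dataset sits in a shared variable (explained further down), and givens substitutes one slice of it per call. Note givens also accepts a dict instead of a list of pairs; names here are mine:

data = theano.shared(numpy.arange(10.0))   # a toy "dataset"
x = T.dvector("x")
i = T.lscalar("i")                         # minibatch index
batch_size = 2
fb = theano.function([i], T.sum(x),
                     givens={x: data[i*batch_size : (i+1)*batch_size]})

>>> fb(0)   # sum of data[0:2]
array(1.0)
>>> fb(1)   # sum of data[2:4]
array(5.0)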

T.grad: differentiation

One of Theano's main features is this differentiation capability: so-called automatic differentiation, which "analyzes an expression to derive the expression for its derivative".

For example

x, y = T.dscalars("x", "y")  # (*) how to declare several symbols at once
z = (x + 2*y)**2

Differentiating this expression with respect to x gives dz/dx = 2(x + 2y). You can obtain that derivative as an expression with

gx = T.grad(z, x)


Similarly, the derivative with respect to y is dz/dy = 4(x + 2y), which you can obtain with

gy = T.grad(z, y)


To actually compute a value, it still has to be turned into a function:

>>> fgy = theano.function([x,y], gy)
>>> fgy(1,2)
array(20.0)

And so on.
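The derivative with respect to x can be checked the same way; at (x, y) = (1, 2) we expect dz/dx = 2(1 + 2*2) = 10:

>>> fgx = theano.function([x,y], gx)
>>> fgx(1,2)
array(10.0)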

shared: shared variable

variable = theano.shared(object)

In the above form, you can declare shared data that the **function**s described earlier can reference. For example:

>>> x = T.dscalar("x")
>>> b = theano.shared(numpy.array([1,2,3,4,5]))
>>> f = theano.function([x], b * x)
>>> f(2)
array([  2.,   4.,   6.,   8.,  10.])

To read and set the value of a shared variable:

>>> b.get_value()
array([1,2,3,4,5])
>>> b.set_value([10,11,12])

Changes are reflected immediately in the function defined earlier: call **f(2)** again and you can see that the result has changed.

>>> f(2)
array([ 20.,  22.,  24.])
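A shared variable can appear in any expression without being listed among the inputs, which is how model parameters are usually held. A small sketch (names are mine):

w = theano.shared(numpy.zeros(3), name="w")   # parameter vector
v = T.dvector("v")
score = theano.function([v], T.dot(w, v))     # w is picked up implicitly

>>> score([1.0, 2.0, 3.0])
array(0.0)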

T.grad, shared variables, and updates: the typical gradient-descent implementation pattern

theano.function has a keyword argument called **updates**, which lets you update shared variables.

For example, to set c as a shared variable and increment it by 1 each time the function f is executed, write as follows.

c = theano.shared(0)
f = theano.function([], c, updates= {c: c+1})

The part **updates={c: c+1}** expresses the value update familiar from programming languages, **c = c + 1**. Note that the function returns the value of c from before the update. Running it gives:

>>> f()
array(0)
>>> f()
array(1)
>>> f()
array(2)

These pieces can be combined to implement gradient descent. For example, suppose that for the data **x = [1,2,3,4,5]** we want to find the **c** that minimizes **y = sum((x-c)^2)**. The code looks like this, for example:

x = T.dvector("x")   # input data
c = theano.shared(0.)  # the parameter we will update; initial value 0 for now
y = T.sum((x-c)**2)  # y = the value we want to minimize
gc = T.grad(y, c)  # partial derivative of y with respect to c
d2 = theano.function([x], y, updates={c: c - 0.05*gc})  # updates c on every call and returns the current y

Now, if you feed **[1,2,3,4,5]** to **d2()** several times:

>>> d2([1,2,3,4,5])
array(55.0)
>>> c.get_value()
1.5
>>> d2([1,2,3,4,5])
array(21.25)
>>> c.get_value()
2.25
>>> d2([1,2,3,4,5])
array(12.8125)
>>> c.get_value()
2.625

You can see that y gradually decreases and c gradually approaches 3, the mean of the data.
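To run it to convergence, just loop. A minimal sketch using the d2 and c defined above; after 100 steps c has effectively converged to 3 and the loss to its minimum, sum((x-3)^2) = 10:

>>> for step in range(100):
...     loss = d2([1, 2, 3, 4, 5])
...
>>> c.get_value()
3.0
>>> loss
array(10.0)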

Implementation of logistic regression

If you have understood everything up to here, you should be able to see how the logistic regression in the following tutorial works.

http://deeplearning.net/software/theano/tutorial/examples.html#a-real-example-logistic-regression

(Well, maybe you understood it from the start? ^^;)
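For reference, here is a condensed sketch of that tutorial's logistic regression, using only the pieces covered above (symbols, shared variables, T.grad, updates). It follows the tutorial's structure but is simplified, so treat it as a sketch rather than the exact tutorial code:

import numpy
import theano
import theano.tensor as T

rng = numpy.random
N, feats = 400, 784                            # examples, features
D = (rng.randn(N, feats), rng.randint(size=N, low=0, high=2))  # random toy data

x = T.dmatrix("x")                             # input matrix, one row per example
y = T.dvector("y")                             # target labels (0 or 1)
w = theano.shared(rng.randn(feats), name="w")  # weight vector (shared)
b = theano.shared(0., name="b")                # bias (shared)

p_1 = 1 / (1 + T.exp(-T.dot(x, w) - b))        # P(label = 1)
xent = -y*T.log(p_1) - (1-y)*T.log(1-p_1)      # cross-entropy per example
cost = xent.mean() + 0.01 * (w**2).sum()       # mean loss + L2 penalty
gw, gb = T.grad(cost, [w, b])                  # gradients w.r.t. both parameters

train = theano.function([x, y], cost,
                        updates={w: w - 0.1*gw, b: b - 0.1*gb})

for i in range(100):                           # each call does one gradient step
    print(train(D[0], D[1]))                   # cost should decrease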
