What is PyTorch

Overview

PyTorch is an open source machine learning library for Python. It defines a class called **Tensor** (torch.Tensor) that stores and operates on multidimensional arrays. A Tensor is similar to a Numpy ndarray, but it can also be placed on a CUDA-capable Nvidia GPU and operated on there. [Source]-> PyTorch-Wikipedia

Differences from other libraries

Machine learning libraries are roughly divided into two types: **Define by Run** and **Define and Run**.

Define by Run: The network is defined while it runs. Because the network can change dynamically, flexible designs are possible. For example, you can switch networks according to the size of the data, or change the design on each iteration. Well-known libraries include PyTorch and Chainer.
Define and Run: The network is defined first and then executed. You can configure a network simply by combining parts like Lego blocks, which is concise and easy to understand. Well-known libraries include Keras and TensorFlow.
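
To make the Define by Run idea concrete, here is a minimal sketch in PyTorch. The function forward and all variable names are hypothetical and chosen only for illustration. The point is that ordinary Python control flow decides, on each call, which computation is recorded for differentiation.

```python
import torch

def forward(x, w):
    # Hypothetical toy "network": the branch taken depends on the data,
    # so a different computation is recorded on each call.
    if x.sum() > 0:
        return (w * x).sum()      # linear in w
    return (w * w * x).sum()      # quadratic in w

w = torch.tensor(3.0, requires_grad=True)

out_pos = forward(torch.tensor([1.0, 2.0]), w)
out_pos.backward()
grad_pos = w.grad.item()   # d/dw of w * (1 + 2) = 3

w.grad.zero_()             # clear the accumulated gradient before the next pass
out_neg = forward(torch.tensor([-1.0, -2.0]), w)
out_neg.backward()
grad_neg = w.grad.item()   # d/dw of w^2 * (-3) = 2 * 3 * (-3) = -18

print(grad_pos, grad_neg)
```

The same function yields two different gradients because the graph is rebuilt from whatever code actually ran.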

PyTorch is a Define by Run library: it builds the network while executing it. Given these differences, it seems best to pick the library that fits the task. Roughly speaking, the simplicity of a Define and Run library such as Keras suits routine data analysis work, while a Define by Run library such as PyTorch may be the better choice for research and difficult tasks that require detailed design.

[Reference source]-> "Introduction to PyTorch" How to use & what is the difference from Tensorflow, Keras, etc.? --Proclassist

PyTorch and Chainer

The biggest difference between PyTorch and Chainer is where they are used: PyTorch is widely used in communities outside Japan, while Chainer is used mainly in Japan. This is because Chainer was developed by Preferred Networks (PFN), a Japanese company. However, in December 2019, PFN announced that it would stop major updates to Chainer and move its research and development to PyTorch, which changed the relationship between the two libraries. Therefore, if you want to use a Define by Run machine learning library from now on, PyTorch is the safe choice.

Getting Started with PyTorch

This section goes through PyTorch's basic behavior, following the official PyTorch tutorial.

How to use Tensor (Part 1: Definition / Calculation)

What is PyTorch? -- PyTorch Tutorials 1.4.0 documentation

The Tensor used in PyTorch is similar to Numpy's ndarray, but a Tensor can also be processed on a GPU for faster computation. The following summarizes how to use PyTorch's Tensor in comparison with Numpy.

Library import

import torch
import numpy as np

Array definition

An array with all zeros.

# Tensor
x_t = torch.zeros(2, 3)
# Numpy
x_n = np.zeros((2,3))

print('Tensor:\n',x_t,'\n')
print('Numpy:\n',x_n)

# ---Output---
#Tensor:
# tensor([[0., 0., 0.],
#        [0., 0., 0.]]) 
#Numpy:
# [[0. 0. 0.]
# [0. 0. 0.]]

An array with all elements 1.

# Tensor
x_t = torch.ones(2,3)
# Numpy
x_n = np.ones((2,3))

print('Tensor:\n',x_t,'\n')
print('Numpy:\n',x_n)

# ---Output---
#Tensor:
# tensor([[1., 1., 1.],
#        [1., 1., 1.]]) 
#Numpy:
# [[1. 1. 1.]
# [1. 1. 1.]]

An array that specifies the values of the elements.

# Tensor
x_t = torch.tensor([[5,3],[10,6]])
# Numpy
x_n = np.array([[5,3],[10,6]])

print('Tensor:\n',x_t,'\n')
print('Numpy:\n',x_n)

# ---Output---
#Tensor:
# tensor([[ 5,  3],
#        [10,  6]]) 
#Numpy:
# [[ 5  3]
# [10  6]]

An array in which element values are specified by random numbers.

# Tensor
x_t = torch.rand(2,3)
# Numpy
x_n = np.random.rand(2,3)

print('Tensor:\n',x_t,'\n')
print('Numpy:\n',x_n)

# ---Output---
#Tensor:
# tensor([[0.5266, 0.1276, 0.6704],
#        [0.0412, 0.5800, 0.0312]]) 
#Numpy:
# [[0.08877971 0.51718009 0.99738679]
# [0.35288525 0.68630145 0.73313903]]
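
Because the values above are random, they change on every run. As a small sketch that goes beyond the original example, fixing the seed with torch.manual_seed and np.random.seed makes the results reproducible (the variable names here are hypothetical):

```python
import numpy as np
import torch

# Re-seeding before each call reproduces the same random Tensor.
torch.manual_seed(0)
a_t = torch.rand(2, 3)
torch.manual_seed(0)
b_t = torch.rand(2, 3)

# Numpy has its own, independent random state.
np.random.seed(0)
a_n = np.random.rand(2, 3)
np.random.seed(0)
b_n = np.random.rand(2, 3)

print(torch.equal(a_t, b_t))     # True
print(np.array_equal(a_n, b_n))  # True
```

Both torch.rand and np.random.rand draw from a uniform distribution on [0, 1).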

Get elements of an array

Each element of an array can be accessed with indexing such as x[0,1], which returns the element in the 1st row and 2nd column of the array x.

# Tensor
x12_t = x_t[0,1]
# Numpy
x12_n = x_n[0,1]

print('Tensor:\n',x12_t,'\n')
print('Numpy:\n',x12_n)

# ---Output---
#Tensor:
# tensor(0.1276) 
#Numpy:
# 0.5171800941956144

Note that indexing a Numpy array returns a numerical value, whereas indexing a PyTorch Tensor returns another Tensor (with zero dimensions) rather than a plain number. An element extracted this way is therefore not a Python scalar. If you want a plain number as in Numpy, call Tensor.item().

x12_value = x12_t.item()
print(x12_t)
print(x12_value)

# ---Output---
# tensor(0.1276)
# 0.12760692834854126
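
As a supplementary sketch (not part of the original walkthrough), the following shows that item() only applies to a Tensor holding exactly one element, and that tolist() converts a whole Tensor to nested Python lists. The exception type raised by item() on a multi-element Tensor has differed between PyTorch versions, so both candidates are caught here as an assumption.

```python
import torch

x = torch.tensor([[5, 3], [10, 6]])

elem = x[1, 0]        # still a Tensor, with zero dimensions
value = elem.item()   # plain Python int
nested = x.tolist()   # nested Python lists of plain ints

try:
    x.item()          # x has four elements, so this cannot work
    item_failed = False
except (ValueError, RuntimeError):
    item_failed = True

print(elem.dim(), value, nested, item_failed)
```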

Four arithmetic operations

In PyTorch, the four basic arithmetic operations work element-wise, just as they do in Numpy.

# Tensor
x_t = torch.Tensor([1,2,3])
y_t = torch.Tensor([2,2,2])
add_t = x_t + y_t
sub_t = x_t - y_t
mul_t = x_t * y_t
div_t = x_t / y_t
print('Tensor:\nAddition:\n',add_t,'\nSubtraction:\n',sub_t,
'\nMultiplication:\n',mul_t,'\nDivision:\n',div_t,'\n')

# Numpy
x_n = np.array([1,2,3])
y_n = np.array([2,2,2])
add_n = x_n + y_n
sub_n = x_n - y_n
mul_n = x_n * y_n
div_n = x_n / y_n
print('Numpy:\nAddition:\n',add_n,'\nSubtraction:\n',sub_n,
'\nMultiplication:\n',mul_n,'\nDivision:\n',div_n)

# ---Output---
#Tensor:
#Addition:
# tensor([3., 4., 5.]) 
#Subtraction:
# tensor([-1.,  0.,  1.]) 
#Multiplication:
# tensor([2., 4., 6.]) 
#Division:
# tensor([0.5000, 1.0000, 1.5000]) 
#
#Numpy:
#Addition:
# [3 4 5] 
#Subtraction:
# [-1  0  1] 
#Multiplication:
# [2 4 6] 
#Division:
# [0.5 1.  1.5]
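
In the output above, the Tensor results are floating point (3., 4., 5.) while the Numpy results are integers. This comes from how the arrays were created rather than from the arithmetic itself. The following sketch, under the assumption that the default dtype has not been changed, compares the torch.Tensor constructor used above with the torch.tensor function:

```python
import torch

legacy = torch.Tensor([1, 2, 3])     # constructor: always the default float type
inferred = torch.tensor([1, 2, 3])   # function: dtype inferred from the data
explicit = torch.tensor([1, 2, 3], dtype=torch.float32)  # dtype stated explicitly

print(legacy.dtype)    # torch.float32
print(inferred.dtype)  # torch.int64
print(explicit.dtype)  # torch.float32
```

When the element type matters, torch.tensor with an explicit dtype leaves less room for surprises.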

How to use Tensor (Part 2: Conversion / Automatic Differentiation)

Autograd: Automatic Differentiation -- PyTorch Tutorials 1.4.0 documentation

Array shape manipulation

The shape of an array (number of rows, number of columns) can be obtained from the shape attribute. It behaves the same way as in Numpy.

# Tensor
x_t = torch.rand(4,3)
row_t = x_t.shape[0]
column_t = x_t.shape[1]
print('Tensor:\n','row: ',row_t,'column: ',column_t)

# Numpy
x_n = np.random.rand(4,3)
row_n = x_n.shape[0]
column_n = x_n.shape[1]
print('Numpy:\n','row: ',row_n,'column: ',column_n)

# ---Output---
#Tensor:
# row:  4 column:  3
#Numpy:
# row:  4 column:  3
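
A few related accessors are worth knowing alongside shape. This is a small sketch with hypothetical variable names; it relies on the fact that a Tensor's shape is a torch.Size, which behaves like a tuple.

```python
import numpy as np
import torch

x_t = torch.rand(4, 3)
x_n = np.random.rand(4, 3)

# Tensor: size() returns the same object type as shape
print(x_t.shape, x_t.size(), x_t.dim(), x_t.numel())
# Numpy: ndim and size are attributes, not methods
print(x_n.shape, x_n.ndim, x_n.size)

rows, cols = x_t.shape   # torch.Size unpacks like a tuple
print(rows, cols)
```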

To change the shape of an array, .view() is often used in PyTorch and .reshape() in Numpy. However, .reshape() is also available on PyTorch's Tensor, just as in Numpy.

# Tensor
x_t = torch.rand(4,3)
y_t = x_t.view(12)
z_t = x_t.view(2,-1)
print('Tensor:\n',x_t,'\n',y_t,'\n',z_t,'\n')

# Numpy
x_n = np.random.rand(4,3)
y_n = x_n.reshape(12)
z_n = x_n.reshape([2,-1])
print('Numpy:\n',x_n,'\n',y_n,'\n',z_n)

# ---Output---
#Tensor:
# tensor([[0.5357, 0.2716, 0.2651],
#        [0.6570, 0.0844, 0.9729],
#        [0.4436, 0.9271, 0.4013],
#        [0.8725, 0.2952, 0.1330]]) 
# tensor([0.5357, 0.2716, 0.2651, 0.6570, 0.0844, 0.9729, 0.4436, 0.9271, 0.4013,
#        0.8725, 0.2952, 0.1330]) 
# tensor([[0.5357, 0.2716, 0.2651, 0.6570, 0.0844, 0.9729],
#        [0.4436, 0.9271, 0.4013, 0.8725, 0.2952, 0.1330]]) 
#
#Numpy:
# [[0.02711389 0.24172801 0.01202486]
# [0.59552453 0.49906154 0.81377212]
# [0.24744639 0.58570244 0.26464142]
# [0.14519645 0.03607043 0.46616757]] 
# [0.02711389 0.24172801 0.01202486 0.59552453 0.49906154 0.81377212
# 0.24744639 0.58570244 0.26464142 0.14519645 0.03607043 0.46616757] 
# [[0.02711389 0.24172801 0.01202486 0.59552453 0.49906154 0.81377212]
# [0.24744639 0.58570244 0.26464142 0.14519645 0.03607043 0.46616757]]

The same reshaping with .reshape():

# Tensor
x_t = torch.rand(4,3)
y_t = x_t.reshape(2,-1)
#y_t = torch.reshape(x_t,[2,-1]) <-- Also works
print('Tensor:\n',y_t,'\n')

# Numpy
x_n = np.random.rand(4,3)
y_n = x_n.reshape(2,-1)
#y_n = np.reshape(x_n,[2,-1]) <-- Also works
print('Numpy:\n',y_n)

# ---Output---
#Tensor:
#tensor([[0.0617, 0.4898, 0.4745, 0.8218, 0.3760, 0.1556],
#        [0.3192, 0.5886, 0.8385, 0.5321, 0.9758, 0.8254]])
#
#Numpy:
#[[0.60080911 0.55132561 0.75930606 0.03275005 0.83148483 0.48780054]
# [0.10971541 0.02317271 0.22571149 0.95286975 0.93045979 0.82358474]]
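
The two methods are not fully interchangeable. As a sketch of the difference: .view() never copies, so it shares memory with the original Tensor and fails when the memory layout does not allow it, while .reshape() falls back to a copy when needed.

```python
import torch

x = torch.arange(6).view(2, 3)   # contiguous Tensor [[0, 1, 2], [3, 4, 5]]
v = x.view(6)                    # a view: shares memory with x
v[0] = 100
shared = x[0, 0].item()          # the write through v is visible in x

xt = x.t()                       # transposing makes the Tensor non-contiguous
try:
    xt.view(6)                   # view cannot express this layout
    view_failed = False
except RuntimeError:
    view_failed = True

r = xt.reshape(6)                # reshape copies when a view is impossible
print(shared, view_failed, r)
```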

A 2-D array is transposed with .transpose() or .t() in PyTorch, and with .transpose() or .T in Numpy. Note that PyTorch's .transpose() takes the two dimensions to swap as arguments.

# Tensor
x_t = torch.rand(3,2)
xt_t = x_t.transpose(0,1)
#xt_t = torch.transpose(x_t,0,1)
#xt_t = x_t.t()
print('Tensor:\n',x_t,'\n',xt_t)

# Numpy
x_n = np.random.rand(3,2)
xt_n = x_n.transpose()
#xt_n = np.transpose(x_n)
#xt_n = x_n.T
print('Numpy:\n',x_n,'\n',xt_n)
# ---Output---
#Tensor:
# tensor([[0.8743, 0.8418],
#        [0.6551, 0.2240],
#        [0.9447, 0.2824]]) 
# tensor([[0.8743, 0.6551, 0.9447],
#        [0.8418, 0.2240, 0.2824]])
#Numpy:
# [[0.80380702 0.81511741]
# [0.29398279 0.78025418]
# [0.19421487 0.43054298]] 
# [[0.80380702 0.29398279 0.19421487]
# [0.81511741 0.78025418 0.43054298]]
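
For arrays with more than two dimensions the two libraries diverge a little. The sketch below (an addition to the original comparison) shows that PyTorch's .transpose() swaps exactly two dimensions and .permute() reorders all of them, while Numpy's .transpose() reverses all axes by default or accepts a full ordering.

```python
import numpy as np
import torch

x_t = torch.zeros(2, 3, 4)
x_n = np.zeros((2, 3, 4))

swapped_t = x_t.transpose(0, 2)     # swap dims 0 and 2
permuted_t = x_t.permute(1, 2, 0)   # arbitrary reordering of all dims

reversed_n = x_n.transpose()        # no arguments: reverse all axes
permuted_n = x_n.transpose(1, 2, 0) # full ordering, like permute

print(tuple(swapped_t.shape), tuple(permuted_t.shape))
print(reversed_n.shape, permuted_n.shape)
```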

Conversion with Numpy

Tensor --> ndarray

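To convert from Tensor to ndarray, use Tensor.numpy(). For a Tensor on the CPU, the resulting ndarray shares memory with the Tensor; it is not a copy. In the first example below, a = a + 1 creates a new Tensor and rebinds the name a, so b still shows the original values. In the second example, the in-place operation add_() modifies the shared memory, so the change also appears in b. (In-place operations have an underscore at the end of the function name.)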

a = torch.ones(5)
b = a.numpy()
a = a + 1
print('a = ',a)
print('b = ',b)

# ---Output---
# a =  tensor([2., 2., 2., 2., 2.])
# b =  [1. 1. 1. 1. 1.]

a = torch.ones(5)
b = a.numpy()
a.add_(1)
#torch.add(a,1,out=a) <-- Same operation
print('a = ',a)
print('b = ',b)

# ---Output---
# a =  tensor([2., 2., 2., 2., 2.])
# b =  [2. 2. 2. 2. 2.]
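
If an independent ndarray is what you want, one option is to copy the Tensor before converting. A minimal sketch, assuming a CPU Tensor and using hypothetical variable names:

```python
import torch

a = torch.ones(3)
shared = a.numpy()           # shares memory with a
copied = a.clone().numpy()   # clone() first, so this array is independent

a.add_(1)                    # in-place update of a

print(shared)   # [2. 2. 2.]
print(copied)   # [1. 1. 1.]
```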

Tensor <-- ndarray

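To convert from ndarray to Tensor, use torch.from_numpy(ndarray). The resulting Tensor shares memory with the ndarray, so the in-place update of a below is also visible in b.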

a = np.ones(5)
b = torch.from_numpy(a)
np.add(a, 1, out=a)
print('a = ',a)
print('b = ',b)
# ---Output---
# a =  [2. 2. 2. 2. 2.]
# b =  tensor([2., 2., 2., 2., 2.], dtype=torch.float64)
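
Two practical variations, shown here as a sketch rather than as part of the tutorial: torch.tensor(ndarray) copies the data instead of sharing it, and .float() converts the float64 data that Numpy produces by default into float32, the usual dtype for PyTorch models.

```python
import numpy as np
import torch

a = np.ones(3)                             # float64 by default
shared = torch.from_numpy(a)               # shares memory, keeps float64
copied = torch.tensor(a)                   # copies the data
as_float32 = torch.from_numpy(a).float()   # new float32 Tensor

np.add(a, 1, out=a)                        # in-place update of the ndarray

print(shared)       # follows the update
print(copied)       # unchanged
print(as_float32)   # unchanged, dtype float32
```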

CUDA Tensor

A Tensor can be moved between devices with the .to() method. This lets you move a Tensor from CPU memory to GPU memory and perform calculations there.

x = torch.rand(2,3)

if torch.cuda.is_available():
    device = torch.device("cuda")          # a CUDA device object
    y = torch.ones_like(x, device=device)  # directly create a tensor on GPU
    x = x.to(device)                       # or just use strings ``.to("cuda")``
    z = x + y
    print(z)
    print(z.to("cpu", torch.double))       # ``.to`` can also change dtype together!

# ---Output---
#tensor([[1.1181, 1.1125, 1.3122],
#        [1.1282, 1.5595, 1.4443]], device='cuda:0')
#tensor([[1.1181, 1.1125, 1.3122],
#        [1.1282, 1.5595, 1.4443]], dtype=torch.float64)
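
The example above does nothing on a machine without a GPU. A common pattern, sketched here with hypothetical variable names, is to pick the device once and write the rest of the code against it, so the same script runs on either CPU or GPU. Note that .numpy() works only on a CPU Tensor, so the result is moved back with .cpu() first.

```python
import torch

# Fall back to the CPU when CUDA is not available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.rand(2, 3, device=device)   # created directly on the chosen device
y = torch.ones_like(x)                # same device and dtype as x
z = x + y                             # computed on that device

z_n = z.cpu().numpy()                 # move to CPU before converting to Numpy
print(z.device, z_n.shape)
```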

Automatic differentiation

Setting the requires_grad attribute of a torch.Tensor to True makes PyTorch track every operation performed on it. Calling backward() on the final result then computes all the gradients automatically, and each gradient is stored in the grad attribute of the corresponding Tensor. To stop tracking, call detach(), which returns a Tensor that is cut off from the calculation history.

Each Tensor has a grad_fn attribute, which refers to the Function that created the Tensor. (For a Tensor defined directly by the user, grad_fn is None; only a Tensor produced by an operation has a grad_fn.)
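
Before the main example, here is a short sketch of stopping history tracking. detach() is the method named above; the torch.no_grad() context manager is an addition that does not appear in this article, included because it is the usual way to disable tracking for a whole block of code.

```python
import torch

x = torch.ones(2, 2, requires_grad=True)

y = x + 2                  # tracked: y has a grad_fn
y_detached = y.detach()    # same values, cut off from the history

with torch.no_grad():      # nothing inside this block is tracked
    z = x * 2

print(y.requires_grad, y.grad_fn is not None)        # True True
print(y_detached.requires_grad, y_detached.grad_fn)  # False None
print(z.requires_grad)                               # False
```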

x = torch.ones(2, 2, requires_grad=True)
print(x)
# ---Output---
#tensor([[1., 1.],
#        [1., 1.]], requires_grad=True)

y = x + 2
print(y)
# ---Output---
#tensor([[3., 3.],
#        [3., 3.]], grad_fn=<AddBackward0>)

print(x.grad_fn)
print(y.grad_fn)
# ---Output---
# None
# <AddBackward0 object at 0x7f2285d93940>

z = y * y * 3
out = z.mean()
print(z)
print(out)
# ---Output---
#tensor([[27., 27.],
#        [27., 27.]], grad_fn=<MulBackward0>)
#tensor(27., grad_fn=<MeanBackward0>)

print(z.grad_fn)
# ---Output---
#<MulBackward0 object at 0x7f2285d93ac8>

out.backward()
print(x.grad)
# ---Output---
#tensor([[4.5000, 4.5000],
#        [4.5000, 4.5000]])

This result can be checked by hand. From the definitions above,

out = \frac{1}{4} \sum_{i} z_i \\
z_i = y_i \cdot y_i \cdot 3 = 3 \cdot (x_i+2)^2

Therefore, with x_i = 1,

\frac{\partial out}{\partial x_i} = \frac{1}{4} \cdot 3 \cdot 2 \cdot (x_i+2) = \frac{3}{2} \cdot (x_i+2) = 4.5

which matches the value of x.grad printed above.
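
The same check can be done numerically for inputs other than 1. This sketch uses arbitrary, hypothetical input values and compares the gradient from backward() with the closed form \frac{3}{2} \cdot (x_i+2) derived above:

```python
import torch

# Arbitrary test values; any 2x2 input works because out is a mean over 4 elements.
x = torch.tensor([[1.0, 2.0], [0.5, -1.0]], requires_grad=True)

out = (3 * (x + 2) ** 2).mean()   # out = 1/4 * sum_i 3 * (x_i + 2)^2
out.backward()

expected = 1.5 * (x.detach() + 2)  # closed form: 3/2 * (x_i + 2)
print(x.grad)
print(torch.allclose(x.grad, expected))
```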

Summary

The main points of this article are summarized below.

- PyTorch is a Define by Run machine learning library.
- It uses an array type called torch.Tensor, which supports high-speed calculation and automatic differentiation. A Tensor can be defined and operated on in almost the same way as Numpy's numpy.ndarray, and the two can easily be converted to each other.
- Setting the requires_grad attribute of a torch.Tensor to True tracks the calculation history, and calling the backward() method at the end of the calculation performs automatic differentiation. This is very useful for updating parameters by backpropagation in neural networks.

[PYTHON] Basics of PyTorch (1) -How to use Tensor-