[PYTHON] Basics of PyTorch (2) -How to make a neural network-

Neural network with PyTorch

I referred to the following official reference: Neural Networks -- PyTorch Tutorials 1.4.0 documentation

The general procedure for training a neural network is as follows.

1. Prepare data (training data / test data).
2. Define a neural network with trainable parameters. (Define the network)
3. Calculate the loss function when training data is input to the network. (Loss function)
4. Calculate the gradient of the loss function with respect to the network parameters. (Backward)
5. Update the parameters based on the gradient of the loss function. (Optimize)
6. Train by repeating steps 3 to 5 many times.

Build a neural network according to the procedure.

1. Data preparation

For the data used to train the neural network, you can either use datasets that are already provided in a package or prepare your own.

If you want to use ready-made data, the torchvision package is convenient. It provides datasets that are often used in machine learning, such as MNIST and CIFAR10 (torchvision.datasets), as well as general-purpose models (torchvision.models) and modules for preprocessing data (torchvision.transforms). See the official documentation for details -> torchvision

When executing training, prepare a torch.utils.data.DataLoader. A DataLoader pairs the input data with its labels and serves them in mini-batches of a specified batch size.

The preparation procedure is as follows; a minimal sketch is shown after the list.

(1) Prepare transforms to preprocess the data.
(2) Instantiate a Dataset class with the transforms as an argument to prepare the Dataset.
(3) Instantiate the DataLoader class with the Dataset as an argument to prepare the DataLoader.
(4) At training time, use the DataLoader to fetch training data and labels in batch-size chunks.
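
Here is that procedure as runnable code. I use MNIST as the example dataset; the dataset choice and the batch size of 4 are my own illustrative assumptions, not fixed by the procedure.

import torch
import torchvision
import torchvision.transforms as transforms

# (1) transforms: convert PIL images to Tensors
transform = transforms.ToTensor()

# (2) Dataset: instantiate with the transforms as an argument
dataset = torchvision.datasets.MNIST(root='./data', train=True,
                                     download=True, transform=transform)

# (3) DataLoader: instantiate with the Dataset as an argument
loader = torch.utils.data.DataLoader(dataset, batch_size=4, shuffle=True)

# (4) at training time, iterate over the DataLoader in batch-size chunks
for images, labels in loader:
    print(images.size(), labels.size())  # torch.Size([4, 1, 28, 28]) torch.Size([4])
    break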

2. Definition of neural network

Neural networks can be constructed using the torch.nn package. nn depends on the automatic differentiation package autograd to define models and differentiate them.

nn.Module holds the layers of a neural network and a forward(input) method that returns the output. Therefore, when constructing a new network, inherit from the nn.Module class.

import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        # 1 input image channel, 6 output channels, 3x3 square convolution
        # kernel
        self.conv1 = nn.Conv2d(1, 6, 3)
        self.conv2 = nn.Conv2d(6, 16, 3)
        # an affine operation: y = Wx + b
        self.fc1 = nn.Linear(16 * 6 * 6, 120)  # 6*6 from image dimension
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # Max pooling over a (2, 2) window
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        # If the size is a square you can only specify a single number
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = x.view(-1, self.num_flat_features(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

    def num_flat_features(self, x):
        size = x.size()[1:]  # all dimensions except the batch dimension
        num_features = 1
        for s in size:
            num_features *= s
        return num_features

net = Net()
print(net)

# ---Output---
#Net(
#  (conv1): Conv2d(1, 6, kernel_size=(3, 3), stride=(1, 1))
#  (conv2): Conv2d(6, 16, kernel_size=(3, 3), stride=(1, 1))
#  (fc1): Linear(in_features=576, out_features=120, bias=True)
#  (fc2): Linear(in_features=120, out_features=84, bias=True)
#  (fc3): Linear(in_features=84, out_features=10, bias=True)
#)

Define the layers held by the network in the __init__() method. Most commonly used layers, such as Linear and Conv2d, are defined in torch.nn. See the official documentation for details -> torch.nn

Similarly, operations such as relu and max_pool2d are defined in torch.nn.functional. They are stateless functions that can be called wherever processing is required. See the official documentation for details -> torch.nn.functional
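
For instance, these functions can be applied directly to Tensors without defining a layer object first (a trivial illustration of my own, not from the tutorial):

import torch
import torch.nn.functional as F

x = torch.tensor([[-1.0, 0.5], [2.0, -3.0]])
print(F.relu(x))                             # negative entries clamped to zero
print(F.max_pool2d(x.view(1, 1, 2, 2), 2))   # 2x2 max pooling -> the maximum, 2.0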

Define the forward propagation of the network in the forward() method, listing in order the layers to pass through and the operations to apply until the input x becomes the output.

It is not necessary to define backward(), the back propagation of the network. Once forward() is defined, back propagation is obtained automatically via autograd.

Trainable parameters can be obtained with net.parameters(). Since the weight and bias parameters are held separately, the result is a list whose length is $2 \times$ the number of defined layers.

params = list(net.parameters())
print(len(params))
print(params[0].size())   # conv1's weight
print(params[1].size())   # conv1's bias
print(params[0][0,:,:,:]) # conv1's weights on the first dimension

# ---Output---
#10
#torch.Size([6, 1, 3, 3])
#torch.Size([6])
#tensor([[[-0.0146, -0.0219,  0.0491],
#         [-0.3047, -0.0137,  0.0954],
#         [-0.2612, -0.2972, -0.2798]]], grad_fn=<SliceBackward>)

Give this network appropriate random input data of size $32 \times 32$.

input = torch.randn(1, 1, 32, 32)
out = net(input)
print(out)

# ---Output---
#tensor([[-0.0703,  0.0575, -0.0679, -0.1168, -0.1093,  0.0815, -0.0085,  0.0408,
#          0.1275,  0.0472]], grad_fn=<AddmmBackward>)

The random input is propagated through the layers with their initial parameters to produce this output.

You can set the gradients of all parameters to zero with the zero_grad() method. It is recommended to run zero_grad() before running backward() so that previously accumulated gradients do not cause unexpected parameter updates.

torch.nn assumes that the input is a mini-batch. For example, nn.Conv2d requires a 4-dimensional Tensor ($\mathrm{nSamples} \times \mathrm{nChannels} \times \mathrm{Height} \times \mathrm{Width}$) as input.
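
If you have only a single sample, you can add a dummy batch dimension with unsqueeze(0), as the official tutorial notes:

single = torch.randn(1, 32, 32)   # one sample: nChannels x Height x Width
batched = single.unsqueeze(0)     # add a fake batch dimension
print(batched.size())             # torch.Size([1, 1, 32, 32])
out = net(batched)                # now acceptable to nn.Conv2d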

3. Loss function

Commonly used loss functions such as MSELoss() and CrossEntropyLoss() are provided in the nn package. Below, the MSE loss is computed between the network's output for a random input and a random target tensor of the same shape.

input = torch.randn(1, 1, 32, 32)
output = net(input)
target = torch.randn(10)    # a dummy target, for example
target = target.view(1,-1)  # make it the same shape as output
criterion = nn.MSELoss()
loss = criterion(output, target)
print(loss)

# ---Output---
#tensor(0.5322, grad_fn=<MseLossBackward>)

Tracing the forward propagation up to this point gives:

input -> conv2d -> relu -> maxpool2d -> conv2d -> relu -> maxpool2d 
      -> view -> linear -> relu -> linear -> relu -> linear 
      -> MSELoss 
      -> loss

This chain can be confirmed by looking at the grad_fn attribute.

print(loss.grad_fn)  # MSELoss
print(loss.grad_fn.next_functions[0][0])  # Linear
print(loss.grad_fn.next_functions[0][0].next_functions[0][0])  # AccumulateGrad (bias of the Linear layer)

# ---Output---
#<MseLossBackward object at 0x7f5008a1c4e0>
#<AddmmBackward object at 0x7f5008a1c5c0>
#<AccumulateGrad object at 0x7f5008a1c4e0>

4. Gradient calculation

The gradient of the loss function is required to perform error backpropagation for the parameter update. In PyTorch, calling loss.backward() on the loss computes the gradients automatically. Since gradients accumulate, it is recommended to call net.zero_grad() at each iteration during training to clear them.

net.zero_grad()     # zeroes the gradient buffers of all parameters
print("conv1.bias.grad before backward")
print(net.conv1.bias.grad)

loss.backward()
print("conv1.bias.grad after backward")
print(net.conv1.bias.grad)

# ---Output---
#conv1.bias.grad before backward
#tensor([0., 0., 0., 0., 0., 0.])
#conv1.bias.grad after backward
#tensor([ 0.0072, -0.0051, -0.0008, -0.0017,  0.0043, -0.0030])

5. Parameter update

Optimizers for the parameter update can be imported from torch.optim. Here we use stochastic gradient descent (SGD), defined by the following update rule. See the official documentation for details -> torch.optim

weight -> weight - learning_rate * gradient
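
Written by hand, this update is just a loop over the parameters (this snippet is taken from the official tutorial):

learning_rate = 0.01
for f in net.parameters():
    f.data.sub_(f.grad.data * learning_rate)

In practice, torch.optim implements this (and more elaborate update rules) for you:
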
import torch.optim as optim

# create your optimizer
optimizer = optim.SGD(net.parameters(), lr=0.01)

# in your training loop:
optimizer.zero_grad()   # zero the gradient buffers
output = net(input)
loss = criterion(output,target)
loss.backward()
optimizer.step()        # do the update

6. Training

Network training is performed by repeating steps 3 to 5 above.

Implementation using CIFAR10

As an example, we train a neural network that classifies images using CIFAR10. I referred to the following official reference: Training a Classifier -- PyTorch Tutorials 1.4.0 documentation

Data preparation

Download the CIFAR10 data provided in torchvision.datasets and normalize it. Since the data in torchvision datasets are PILImages with values in the range [0, 1], they are converted here to Tensors and normalized to the range [-1, 1] (Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)) applies $x \mapsto (x - 0.5) / 0.5$ per channel).

import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

Let's display the prepared data.

import matplotlib.pyplot as plt
import numpy as np

def imshow(img):
    img = img/2 + 0.5 # unnormalize
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1,2,0)))
    plt.show()
    
# get some random training images
dataiter = iter(trainloader)
images, labels = next(dataiter)

# show images
imshow(torchvision.utils.make_grid(images))

# print labels
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))

[Output: a grid of the four sampled training images with their labels]

Network construction

Next, we build a network for classifying images.

import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()

Definition of loss function and optimization method

Once the network is built, define the loss function and optimization method.

import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

Training

Once the network, loss function, and optimization method have been defined, training is started using the training data.

for epoch in range(2): # loop over the dataset multiple times
    
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data
        
        # zero the parameter gradients
        optimizer.zero_grad()
        
        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs,labels)
        loss.backward()
        optimizer.step()
        
        # print statistics
        running_loss += loss.item()
        if i%2000==1999: # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' % (epoch+1, i+1, running_loss/2000))
            running_loss = 0.0
            
print('Finished Training')

# ---Output---
#[1,  2000] loss: 2.149
#[1,  4000] loss: 1.832
#[1,  6000] loss: 1.651
#[1,  8000] loss: 1.573
#[1, 10000] loss: 1.514
#[1, 12000] loss: 1.458
#[2,  2000] loss: 1.420
#[2,  4000] loss: 1.371
#[2,  6000] loss: 1.348
#[2,  8000] loss: 1.333
#[2, 10000] loss: 1.326
#[2, 12000] loss: 1.293
#Finished Training

Here, the entire training set is passed through twice (two epochs; with a batch size of 4, one epoch is 12,500 mini-batches, which is why the log above counts up to 12000). The loss decreases as training proceeds, so the progress of learning can be observed. (Learning has clearly not converged yet, but we stop here and move on.)

Save model parameters

The parameters of the trained model can be saved with torch.save().

PATH = './cifar_net.pth'
torch.save(net.state_dict(), PATH)

Apply to test data

Apply a trained network to the test data. First, check the contents of the test data.

dataiter = iter(testloader)
images, labels = next(dataiter)

imshow(torchvision.utils.make_grid(images))
print('GroundTruth: ', ' '.join('%5s' % classes[labels[j]] for j in range(4)))

[Output: a grid of four test images with their ground-truth labels]

Then, read the saved network parameters. After that, input the test data into the read model and display the classification result.

net = Net()
net.load_state_dict(torch.load(PATH))
# ---Output---
# <All keys matched successfully>

outputs = net(images)
_, predicted = torch.max(outputs, 1)
print('Predicted: ', ' '.join('%5s' % classes[predicted[j]] for j in range(4)))
# ---Output---
# Predicted:    cat  ship plane plane
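
A note on the prediction step: torch.max(outputs, 1) returns a tuple of (maximum values, their indices) along dimension 1, and the indices serve as the predicted class labels. An equivalent form (my phrasing, not the tutorial's) uses argmax:

predicted = outputs.argmax(dim=1)   # index of the largest score per sample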

The third image is misjudged as plane instead of ship, but the other three are correctly classified.

Let's calculate the accuracy over all 10000 test images.

correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
        
print('Accuracy of the network on the 10000 test images: %d %%' % (100*correct/total))
# ---Output---
# Accuracy of the network on the 10000 test images: 52 %

The accuracy is 52%: well above the 10% chance level of guessing among 10 classes, but not very accurate as an image classifier.

Next, let's obtain the accuracy for each class.

class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))

with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs,1)
        c = (predicted == labels).squeeze()
        for i in range(4):
            label = labels[i]
            class_correct[label] += c[i].item()
            class_total[label] += 1
            
for i in range(10):
    print('Accuracy of %5s : %2d %%' % ( classes[i], 100*class_correct[i]/class_total[i]))

# ---Output---
# Accuracy of plane : 61 %
# Accuracy of   car : 61 %
# Accuracy of  bird : 52 %
# Accuracy of   cat : 26 %
# Accuracy of  deer : 34 %
# Accuracy of   dog : 51 %
# Accuracy of  frog : 67 %
# Accuracy of horse : 43 %
# Accuracy of  ship : 76 %
# Accuracy of truck : 50 %

From this, we can see that the network is poor at classifying cats but good at classifying ships.

When using GPU

When training on a GPU, you need to specify the CUDA device with device. First, check whether a GPU is available: if the code below prints cuda:0, a GPU can be used.

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# Assuming that we are on a CUDA machine, this should print a CUDA device:
print(device)

# ---Output---
# cuda:0

Networks and data can be moved to the GPU with .to(device). During training, don't forget to move each mini-batch of data to the GPU at every iteration.

net.to(device)
inputs, labels = data[0].to(device), data[1].to(device)
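
Putting these moves into the training loop looks like the following sketch (assuming the same trainloader, criterion, and optimizer as above):

net.to(device)

for epoch in range(2):
    for data in trainloader:
        # move each mini-batch to the GPU at every iteration
        inputs, labels = data[0].to(device), data[1].to(device)

        optimizer.zero_grad()
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()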

Summary

Finally, the above procedure is summarized in a single script.

# import packages -------------------------------
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms

# prepare data ----------------------------------
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

# define a network ------------------------------
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()

# define loss function and optimizer -------------
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

# start training ---------------------------------
for epoch in range(2): # loop over the dataset multiple times
    
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data
        
        # zero the parameter gradients
        optimizer.zero_grad()
        
        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs,labels)
        loss.backward()
        optimizer.step()
        
        # print statistics
        running_loss += loss.item()
        if i%2000==1999: # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' % (epoch+1, i+1, running_loss/2000))
            running_loss = 0.0
            
print('Finished Training')

# check on test data ----------------------------
correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
        
print('Accuracy of the network on the 10000 test images: %d %%' % (100*correct/total))
