[PYTHON] "Deep Learning from scratch" Self-study memo (No. 10-2) Initial value of weight

While reading "Deep Learning from scratch" (by Yasuki Saito, published by O'Reilly Japan), I make notes of the sites I referred to. Part 10 ← → Part 11

I slightly modified ch06/optimizer_compare_mnist.py, the source code used to compare update methods on the MNIST dataset, and tried several ways of setting the initial weight values.

# coding: utf-8
import os
import sys
sys.path.append(os.pardir)  # settings for importing files in the parent directory
import numpy as np  # needed for np.random.choice and np.argmax below
import matplotlib.pyplot as plt
from dataset.mnist import load_mnist
from common.util import smooth_curve
from common.multi_layer_net import MultiLayerNet
from common.optimizer import *

# 0: Read the MNIST data ==========
(x_train, t_train), (x_test, t_test) = load_mnist(normalize=True)

train_size = x_train.shape[0]
batch_size = 128
max_iterations = 2000

# 1: Experiment settings ==========
optimizers = {}
optimizers['SGD'] = SGD()
optimizers['Momentum'] = Momentum()
optimizers['AdaGrad'] = AdaGrad()
optimizers['Adam'] = Adam()
#optimizers['RMSprop'] = RMSprop()

networks = {}
train_loss = {}
for key in optimizers.keys():
    networks[key] = MultiLayerNet(
        input_size=784, hidden_size_list=[100, 100, 100, 100],
        output_size=10,
        activation='relu', weight_init_std='relu',
        weight_decay_lambda=0)
    train_loss[key] = []

# 2: Start training ==========
for i in range(max_iterations):
    batch_mask = np.random.choice(train_size, batch_size)
    x_batch = x_train[batch_mask]
    t_batch = t_train[batch_mask]
    
    for key in optimizers.keys():
        grads = networks[key].gradient(x_batch, t_batch)
        optimizers[key].update(networks[key].params, grads)
    
        loss = networks[key].loss(x_batch, t_batch)
        train_loss[key].append(loss)

# Evaluate with the test data
x = x_test
t = t_test

for key in optimizers.keys():
    network = networks[key]

    y = network.predict(x)

    accuracy_cnt = 0
    for i in range(len(x)):
        p = np.argmax(y[i])  # predicted class = index of the largest output
        if p == t[i]:
            accuracy_cnt += 1

    print(key + " Accuracy:" + str(float(accuracy_cnt) / len(x)))

Specify 'relu' as the activation function and use "He initialization" for the initial weight values:

activation='relu', weight_init_std='he', 

Results on the test data:

SGD Accuracy: 0.9325
Momentum Accuracy: 0.966
AdaGrad Accuracy: 0.9707
Adam Accuracy: 0.972

Specify 'sigmoid' as the activation function and use "Xavier initialization" for the initial weight values:

activation='sigmoid', weight_init_std='xavier', 

Results on the test data:

SGD Accuracy: 0.1135
Momentum Accuracy: 0.1028
AdaGrad Accuracy: 0.9326
Adam Accuracy: 0.9558

SGD and Momentum had poor recognition rates, so I increased the number of iterations (max_iterations) to 10000.
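The only change for this run, assuming the script above, is the iteration count:

max_iterations = 10000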

SGD Accuracy: 0.1135
Momentum Accuracy: 0.9262
AdaGrad Accuracy: 0.9617
Adam Accuracy: 0.9673

Momentum's recognition rate improved considerably, but SGD is still not learning at all.

Specify 'relu' as the activation function and use a normal distribution with standard deviation 0.01 for the initial weight values:

activation='relu', weight_init_std=0.01, 

Results on the test data:

SGD Accuracy: 0.1135
Momentum Accuracy: 0.1135
AdaGrad Accuracy: 0.9631
Adam Accuracy: 0.9713

SGD and Momentum don't seem to be learning at all.
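This is consistent with the activation-distribution experiment in the book (ch06/weight_init_activation_histogram.py): with a standard deviation of 0.01, the ReLU activations shrink toward zero layer by layer, so the backpropagated gradients become almost zero and plain SGD and Momentum barely move the weights. A minimal sketch of the idea (layer count and sizes are arbitrary):

import numpy as np

def relu(x):
    return np.maximum(0, x)

node_num = 100
x = np.random.randn(1000, node_num)  # 1000 input samples
activations = {}

for i in range(5):  # 5 layers, all initialized with std = 0.01
    if i != 0:
        x = activations[i - 1]
    w = np.random.randn(node_num, node_num) * 0.01
    activations[i] = relu(np.dot(x, w))

for i, a in activations.items():
    print('layer %d: std = %.8f' % (i + 1, a.std()))
# The std shrinks by roughly an order of magnitude per layer,
# which is why the loss hardly moves under SGD and Momentum.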

Part 10 ← → Part 11
