[PYTHON] [Deep Learning from scratch] Implementation of Momentum method and AdaGrad method

Introduction

This article is my attempt at an easy-to-understand summary of Chapter 7, "Techniques for Learning", of Deep Learning from Scratch. I was able to understand it even with my humanities background, so I hope you will find it comfortable to read. I would also be delighted if you refer to it while studying this book.

Implementation of the Momentum method

import numpy as np

class Momentum:
    def __init__(self, lr=0.01, momentum=0.9):
        self.lr = lr              # learning rate
        self.momentum = momentum  # momentum constant
        self.v = None             # velocity

    def update(self, params, grads):
        if self.v is None:  # initialize the velocity of each parameter only on the first call
            self.v = {}
            for key, val in params.items():
                self.v[key] = np.zeros_like(val)  # start each parameter's velocity at zero

        for key in params.keys():
            # compute the velocity at the current point from the gradient
            self.v[key] = self.momentum * self.v[key] - self.lr * grads[key]
            params[key] += self.v[key]

The Momentum method introduces the concept of velocity, so we first create a velocity as an instance variable.

The velocity at the current point is computed from the gradient (v ← momentum · v − lr · grad), and the parameters are updated by adding it to the current weight parameters (W ← W + v).
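As a quick check, here is a minimal usage sketch. It is not from the book; the toy `params` and `grads` dictionaries are made up for illustration. It shows how `update` mutates the parameters in place:

import numpy as np

# toy example: one weight matrix and its gradient (values are made up)
params = {'W1': np.array([[1.0, 2.0], [3.0, 4.0]])}
grads  = {'W1': np.array([[0.1, 0.1], [0.1, 0.1]])}

optimizer = Momentum(lr=0.01, momentum=0.9)
optimizer.update(params, grads)  # first call: v starts at zero, so the step is just -lr * grad
print(params['W1'])              # each entry decreased by 0.001

optimizer.update(params, grads)  # second call: the velocity term now carries momentum from the first step

Because the velocity accumulates past gradients, repeated updates in the same direction take larger and larger steps, which is exactly what distinguishes Momentum from plain SGD.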

Implementation of the AdaGrad method

import numpy as np

class AdaGrad:  # decays the learning rate individually for each parameter
    def __init__(self, lr=0.01):
        self.lr = lr
        self.h = None  # accumulated sum of squared gradients

    def update(self, params, grads):
        if self.h is None:  # initialize h for each parameter only on the first call
            self.h = {}
            for key, val in params.items():
                self.h[key] = np.zeros_like(val)

        for key in params.keys():
            # accumulate the sum of squared gradients for each parameter into h
            self.h[key] += grads[key] * grads[key]
            # divide the step by sqrt(h); the 1e-7 term prevents division by zero
            params[key] -= self.lr * grads[key] / (np.sqrt(self.h[key]) + 1e-7)

As for the AdaGrad method, there is little extra to explain: it directly implements the formula from the previous article.

It subtracts the gradient just like SGD, but divides each step by the square root of the accumulated squared gradients, so the effective learning rate gradually shrinks, separately for each parameter.
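As a sanity check, here is a minimal sketch (the toy parameter and gradient values are made up for illustration) showing how the step size shrinks as the squared gradients accumulate in h:

import numpy as np

# toy example: a single scalar parameter with a constant gradient (values are made up)
params = {'W1': np.array([1.0])}
grads  = {'W1': np.array([0.5])}

optimizer = AdaGrad(lr=0.1)
for i in range(3):
    optimizer.update(params, grads)
    print(params['W1'])  # each step is smaller than the last because h keeps growing

Even though the gradient is identical on every call, the first step is about 0.1, the second about 0.07, and the third about 0.06: the accumulated h in the denominator decays the effective learning rate automatically.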
