[PYTHON] [Deep Learning from scratch] Main parameter update methods for neural networks

Introduction

This article is my attempt at an easy-to-understand summary of **Deep Learning from scratch, Chapter 7: Learning Techniques**. I come from a humanities background and was still able to understand it, so I hope you will find it comfortable to read. I would also be delighted if you refer to it while studying this book.

SGD

SGD updates each parameter by computing its gradient, as before, multiplying it by the learning rate, and subtracting the result from the current value:

$W \leftarrow W - \eta \dfrac{\partial L}{\partial W}$

This method is simple and easy to implement, but the direction the gradient points is not, in general, the direction of the true minimum, so the search zigzags inefficiently toward the parameter values that minimize the loss function. That is its weak point.
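As a concrete illustration, here is a minimal SGD optimizer sketch. The class-with-`update(params, grads)` shape, where parameters and gradients are passed as dictionaries of NumPy arrays, is an assumption for this sketch, not necessarily the exact code in the book:

```python
import numpy as np

class SGD:
    """Plain stochastic gradient descent: W <- W - lr * dL/dW."""
    def __init__(self, lr=0.01):
        self.lr = lr

    def update(self, params, grads):
        # Subtract the scaled gradient from every parameter in place.
        for key in params:
            params[key] -= self.lr * grads[key]

# Toy usage: one update step on a single parameter array.
params = {"W": np.array([1.0, 2.0])}
grads = {"W": np.array([0.5, 0.5])}
SGD(lr=0.1).update(params, grads)
print(params["W"])  # [0.95 1.95]
```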

Momentum

Momentum adds the concept of velocity. From the gradient at the current point it computes the velocity at which the parameter "rolls down the slope" toward the minimum of the loss function, and updates the parameter by adding that velocity:

$v \leftarrow \alpha v - \eta \dfrac{\partial L}{\partial W}$

$W \leftarrow W + v$

It still searches in a zigzag like SGD, but the zigzag is damped and the path becomes more rounded, so the inefficiency is reduced compared with SGD.
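A minimal sketch of the Momentum update in the same assumed `update(params, grads)` dictionary style; the velocity `v` is created lazily on the first call:

```python
import numpy as np

class Momentum:
    """Momentum update: v <- momentum * v - lr * grad;  W <- W + v."""
    def __init__(self, lr=0.01, momentum=0.9):
        self.lr = lr
        self.momentum = momentum
        self.v = None  # velocity per parameter, created on first update

    def update(self, params, grads):
        if self.v is None:
            self.v = {k: np.zeros_like(p) for k, p in params.items()}
        for key in params:
            # Decay the previous velocity, then push it downhill.
            self.v[key] = self.momentum * self.v[key] - self.lr * grads[key]
            params[key] += self.v[key]

# Toy usage: the second step moves further because velocity accumulates.
opt = Momentum(lr=0.1, momentum=0.9)
params = {"W": np.array([1.0])}
grads = {"W": np.array([0.5])}
opt.update(params, grads)  # W: 1.0 - 0.05 = 0.95
opt.update(params, grads)  # W: 0.95 - 0.095 = 0.855
```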

AdaGrad

AdaGrad uses a technique called **learning-rate decay**: the effective learning rate is large at first and is then reduced as learning progresses. The parameters are updated by large amounts initially, and the updates gradually become smaller:

$h \leftarrow h + \dfrac{\partial L}{\partial W} \odot \dfrac{\partial L}{\partial W}$

$W \leftarrow W - \eta \dfrac{1}{\sqrt{h}} \dfrac{\partial L}{\partial W}$

(Here $\odot$ denotes element-wise multiplication.) With this method the zigzag of the search is reduced even further, so the search becomes efficient.
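A sketch of AdaGrad in the same assumed style: `h` accumulates the squared gradients per parameter, so the effective step size shrinks for parameters that have already received large updates. The small constant `1e-7` guarding against division by zero is a common convention, not something fixed by the method itself:

```python
import numpy as np

class AdaGrad:
    """AdaGrad: h <- h + grad*grad;  W <- W - lr * grad / (sqrt(h) + eps)."""
    def __init__(self, lr=0.01):
        self.lr = lr
        self.h = None  # running sum of squared gradients per parameter

    def update(self, params, grads):
        if self.h is None:
            self.h = {k: np.zeros_like(p) for k, p in params.items()}
        for key in params:
            self.h[key] += grads[key] * grads[key]
            # Each step divides by sqrt(h), so repeated large gradients
            # shrink the effective learning rate for that parameter.
            params[key] -= self.lr * grads[key] / (np.sqrt(self.h[key]) + 1e-7)

# Toy usage: with a constant gradient, each step is smaller than the last.
opt = AdaGrad(lr=0.1)
params = {"W": np.array([1.0])}
grads = {"W": np.array([0.5])}
opt.update(params, grads)  # step of ~0.100
opt.update(params, grads)  # step of ~0.071
```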

Adam

Adam is a newer method, proposed in 2015, that combines the ideas of Momentum and AdaGrad. It is too involved to explain here, but it can search very efficiently.

The methods most commonly used today are the simple SGD and the complex but very efficient Adam.
