PRML Chapter 5 Neural Network Python Implementation

Chapter 5 of PRML covers neural networks, which have become popular again in recent years. There are plenty of neural network implementations on the net, so I wanted to implement something I was less familiar with, and decided to build a mixture density network using almost nothing but NumPy. The amount of code turned out to be quite large, however, so I am splitting it into two parts: this article implements an ordinary neural network, and the mixture density network will follow in the next one.

neural network

Network structure

2000px-Artificial_neural_network.svg.png

This is a schematic of a two-layer neural network with a three-dimensional input ${\bf x} = (x_1, x_2, x_3)$, four hidden units ${\bf z} = (z_1, z_2, z_3, z_4)$, and a two-dimensional output ${\bf y} = (y_1, y_2)$. The first layer maps the input to the hidden units (${\bf x} \to {\bf z}$), and the second layer maps the hidden units to the output (${\bf z} \to {\bf y}$). The number of units per layer and the number of layers need to be chosen according to the problem.

Forward propagation

Forward propagation is the step that computes the network output from the input: first the hidden units ${\bf z}$ are computed from the input ${\bf x}$, and then the output ${\bf y}$ is computed from the hidden units ${\bf z}$.

One of the first-layer hidden units, $z_1$, is computed as

\begin{align}
a_{z_1} &= w_{11}^{(1)}x_1+w_{12}^{(1)}x_2+w_{13}^{(1)}x_3 + b_1^{(1)}\\
z_1 &= f^{(1)}(a_{z_1})
\end{align}

where $a_{z_1}$ is the activation of the first hidden unit, $w_{1j}^{(1)}$ is the weight from the $j$-th input unit to the first hidden unit, $b_1^{(1)}$ is the bias of the first hidden unit, and $f^{(1)}$ is the activation function of the first layer.

The same kind of formula can be written for $z_2, z_3, z_4$, but carrying around three more equations gets cumbersome, so the matrix form is usually used instead:

\begin{align}
\begin{bmatrix}
a_{z_1}\\
a_{z_2}\\
a_{z_3}\\
a_{z_4}
\end{bmatrix}
&=
\begin{bmatrix}
w_{11}^{(1)} & w_{12}^{(1)} & w_{13}^{(1)}\\
w_{21}^{(1)} & w_{22}^{(1)} & w_{23}^{(1)}\\
w_{31}^{(1)} & w_{32}^{(1)} & w_{33}^{(1)}\\
w_{41}^{(1)} & w_{42}^{(1)} & w_{43}^{(1)}
\end{bmatrix}
\begin{bmatrix}
x_1\\
x_2\\
x_3
\end{bmatrix}
+
\begin{bmatrix}
b_1^{(1)}\\
b_2^{(1)}\\
b_3^{(1)}\\
b_4^{(1)}
\end{bmatrix}
\\
\begin{bmatrix}
z_1\\
z_2\\
z_3\\
z_4
\end{bmatrix}
&=
\begin{bmatrix}
f^{(1)}(a_{z_1})\\
f^{(1)}(a_{z_2})\\
f^{(1)}(a_{z_3})\\
f^{(1)}(a_{z_4})
\end{bmatrix}
\end{align}

This is written more concisely as

\begin{align}
{\bf a}_z &= W^{(1)}{\bf x} + {\bf b}^{(1)}\\
{\bf z} &= f^{(1)}({\bf a}_z)
\end{align}

This completes the forward propagation of the first layer.

The second layer can be expressed in the same way:

\begin{align}
{\bf a}_y &= W^{(2)}{\bf z} + {\bf b}^{(2)}\\
{\bf y} &= f^{(2)}({\bf a}_y)
\end{align}


Putting the two layers together,

{\bf y} = f^{(2)}(W^{(2)}f^{(1)}(W^{(1)}{\bf x} + {\bf b}^{(1)}) + {\bf b}^{(2)})

In this way, the output is computed from the input.
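
To make the composition concrete, here is a minimal NumPy sketch of the forward pass for the 3-4-2 example above. The weight values, the tanh hidden activation, and the identity output activation are assumptions chosen purely for illustration; the actual implementation appears later in the article.

import numpy as np

# toy 3-4-2 network: W1 is (4, 3), b1 is (4,), W2 is (2, 4), b2 is (2,)
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)

x = np.array([0.5, -1.0, 2.0])   # input

a_z = W1 @ x + b1                # first-layer activations
z = np.tanh(a_z)                 # hidden units, f1 = tanh (assumption)
a_y = W2 @ z + b2                # second-layer activations
y = a_y                          # output, f2 = identity map (assumption)
print(y)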

Backpropagation

The neural network has many parameters (26 in this example). Backpropagation is an efficient way to compute the gradients of a cost function with respect to these parameters. Consider an input-target pair $\{{\bf x}, {\bf t}\}$ and a cost function $E$ to be minimized. First, forward propagation computes the network output ${\bf y}$ from the input ${\bf x}$. The cost function, for example $E = \|{\bf t} - {\bf y}\|^2$, then measures the error between the output and the target ${\bf t}$. This error is propagated back towards the input. The error at the output layer, which is just the partial derivative of the cost function with respect to the output, is

{\partial E\over\partial y_i}

Using this, the error at ${\bf a}_y$, that is, the partial derivative of the cost function with respect to ${\bf a}_y$, is obtained as

\begin{align}
{\partial E\over\partial a_{y_i}} &= {\partial E\over\partial y_i}{\partial y_i\over\partial a_{y_i}}\\
&= {\partial E\over\partial y_i}f'^{(2)}(a_{y_i})\\
(&= y_i - t_i)
\end{align}

If the output-layer activation function $f^{(2)}$ is the canonical link for the cost function, such as the identity map, the sigmoid function, or the softmax function, then **the error at ${\bf a}_y$ is simply the difference between the output and the target**. Once the error at ${\bf a}_y$ is known, the gradients of the second-layer parameters follow. Since

\begin{align}
{\partial a_{y_i}\over\partial w_{ij}^{(2)}} &= z_j\\
{\partial a_{y_i}\over\partial b_i^{(2)}} &= 1
\end{align}

we obtain

\begin{align}
{\partial E\over\partial w_{ij}^{(2)}} &= {\partial E\over\partial a_{y_i}}{\partial a_{y_i}\over\partial w_{ij}^{(2)}}\\
&= {\partial E\over\partial a_{y_i}}z_j\\
{\partial E\over\partial b_i^{(2)}} &= {\partial E\over\partial a_{y_i}}{\partial a_{y_i}\over\partial b_i^{(2)}}\\
&= {\partial E\over\partial a_{y_i}}
\end{align}

Thus the second-layer gradients can be computed: we propagated the output error back to obtain the error at the activations ${\bf a}_y$, and used that error to compute the parameter gradients. Moreover, once the error at ${\bf a}_y$ is known, not only the weight gradients but also the error at the second layer's input ${\bf z}$ can be computed:

\begin{align}
{\partial E\over\partial z_j} &= \sum_{i=1}^2 {\partial E\over\partial a_{y_i}}{\partial a_{y_i}\over\partial z_j}\\
&= \sum_{i=1}^2 {\partial E\over\partial a_{y_i}}w_{ij}^{(2)}
\end{align}

In this way the error is propagated from the output ${\bf y}$ of the second layer back to its input ${\bf z}$. Since the error at ${\bf z}$ is also the error at the output of the first layer, repeating the same procedure yields the error at the first-layer activations and the gradients of the first-layer parameters:

\begin{align}
{\partial E\over\partial a_{z_i}} &= {\partial E\over\partial z_i}{\partial z_i\over\partial a_{z_i}}\\
&= {\partial E\over\partial z_i}f'^{(1)}(a_{z_i})\\
{\partial E\over\partial x_j} &= \sum_{i=1}^4 {\partial E\over\partial a_{z_i}}{\partial a_{z_i}\over\partial x_j}\\
&= \sum_{i=1}^4 {\partial E\over\partial a_{z_i}}w_{ij}^{(1)}\\
{\partial E\over\partial w_{ij}^{(1)}} &= {\partial E\over\partial a_{z_i}}{\partial a_{z_i}\over\partial w_{ij}^{(1)}}\\
&= {\partial E\over\partial a_{z_i}}x_j\\
{\partial E\over\partial b_i^{(1)}} &= {\partial E\over\partial a_{z_i}}{\partial a_{z_i}\over\partial b_i^{(1)}}\\
&= {\partial E\over\partial a_{z_i}}
\end{align}

In this way the error at the output layer is propagated all the way back to the input, and the gradients needed to update the weight parameters are obtained along the way.
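
As a rough sketch of how these equations map to NumPy (same toy 3-4-2 network and assumptions as the forward-propagation snippet above, with the squared-error cost $E = {1\over2}\|{\bf y} - {\bf t}\|^2$):

import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)
x, t = np.array([0.5, -1.0, 2.0]), np.array([1.0, 0.0])

# forward pass (tanh hidden units, identity output: assumptions for illustration)
a_z = W1 @ x + b1
z = np.tanh(a_z)
a_y = W2 @ z + b2
y = a_y

# backward pass
delta_y = y - t                            # dE/da_y (identity output is the canonical link)
dW2, db2 = np.outer(delta_y, z), delta_y   # second-layer gradients
delta_z = (W2.T @ delta_y) * (1 - z ** 2)  # dE/da_z, using tanh'(a) = 1 - tanh(a)^2
dW1, db1 = np.outer(delta_z, x), delta_z   # first-layer gradients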

Backpropagation summary

Propagate the error backwards, starting from the error at the output units:

Output unit: ${\partial E\over\partial y_i}$
Second-layer activation: ${\partial E\over\partial a_{y_i}}={\partial E\over\partial y_i}f'^{(2)}(a_{y_i})~~~(=y_i - t_i)$

Then compute the parameter gradients from the error at the activations:

\begin{align}
{\partial E\over\partial w_{ij}^{(2)}} &= {\partial E\over\partial a_{y_i}}z_j\\
{\partial E\over\partial b_i^{(2)}} &= {\partial E\over\partial a_{y_i}}
\end{align}

This error is then propagated on to the first layer:

Hidden unit: ${\partial E\over\partial z_j}=\sum_{i=1}^2 {\partial E\over\partial a_{y_i}}w_{ij}^{(2)}$
First-layer activation: ${\partial E\over\partial a_{z_j}}={\partial E\over\partial z_j}f'^{(1)}(a_{z_j})$

Finally, the parameter gradients are obtained from the error at the activations:

\begin{align}
{\partial E\over\partial w_{ij}^{(1)}} &= {\partial E\over\partial a_{z_i}}x_j\\
{\partial E\over\partial b_i^{(1)}} &= {\partial E\over\partial a_{z_i}}
\end{align}

code

Library

Besides NumPy, I only used SciPy, to draw the initial weights from a truncated normal distribution. I usually build neural networks with TensorFlow, and since TensorFlow initializes weights with a truncated normal distribution, I followed the same convention here.

import numpy as np
from scipy.stats import truncnorm
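
One thing to note (my own addition, not from the original article): scipy's truncnorm expresses its truncation bounds a and b in units of scale, so truncating at two standard deviations looks like this:

from scipy.stats import truncnorm

std = 0.5
# a and b are in units of scale, so this truncates at +/- 2 standard deviations
w = truncnorm(a=-2, b=2, scale=std).rvs((3, 4))
print(w.min() >= -2 * std, w.max() <= 2 * std)  # prints: True True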

layer

A class representing one layer of the neural network. The weights are initialized when an instance is created; calling the instance performs forward propagation on an input, and the error is then backpropagated through it. The layers actually used to build the network are derived from this class.

class Layer(object):

    def __init__(self, dim_input, dim_output, std=1., bias=0.):
        # truncate at two standard deviations (truncnorm's a and b are in units of scale)
        self.w = truncnorm(a=-2, b=2, scale=std).rvs((dim_input, dim_output))
        self.b = np.ones(dim_output) * bias

    def __call__(self, X):
        self.input = X
        return self.forward_propagation(X)

    def back_propagation(self, delta, learning_rate):
        # derivative with respect to activation
        delta = delta * self.activation_derivative()

        w = np.copy(self.w)
        self.w -= learning_rate * self.input.T.dot(delta)
        self.b -= learning_rate * np.sum(delta, axis=0)

        # derivative with respect to input
        return delta.dot(w.T)

Layer method descriptions:
__init__: initialize the parameters given the input and output dimensions of this layer
__call__: compute the output of this layer by forward propagation from its input
back_propagation: given the error at this layer's output and the learning rate, update the parameters and return the error at this layer's input

Layer: Identity map

A layer whose activation function is the identity map $f(a) = a$. Concrete layers are built by defining the forward propagation and the derivative of the activation function as new methods.

class LinearLayer(Layer):

    def forward_propagation(self, X):
        return X.dot(self.w) + self.b

    def activation_derivative(self):
        return 1

Layer: Logistic sigmoid function

A layer whose activation function is the logistic sigmoid function $f(x) = {1\over 1+\exp(-x)}$. Its derivative is $f'(x) = f(x)(1 - f(x))$.

class SigmoidLayer(Layer):

    def forward_propagation(self, X):
        activation = X.dot(self.w) + self.b
        self.output = 1 / (1 + np.exp(-activation))
        return self.output

    def activation_derivative(self):
        return self.output * (1 - self.output)

In the same way, layers using the hyperbolic tangent $\tanh(x)$ or the rectified linear function $\max(x, 0)$ as the activation function can also be built, for example as sketched below.
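
For example, a tanh layer only needs its own forward propagation and activation derivative; the same class also appears in the full listing below.

class TanhLayer(Layer):

    def forward_propagation(self, X):
        activation = X.dot(self.w) + self.b
        self.output = np.tanh(activation)
        return self.output

    def activation_derivative(self):
        # tanh'(a) = 1 - tanh(a)^2
        return 1 - self.output ** 2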

Cost function: sum of squares error

This is the error function commonly used for **regression problems**.

class SumSquaresError(object):

    def activate(self, X):
        return X

    def __call__(self, X, targets):
        return 0.5 * np.sum((X - targets) ** 2)

    def delta(self, X, targets):
        return X - targets

Cost function: sigmoid cross entropy

This is the error function used for **binary classification**. The cross entropy is computed after applying the logistic sigmoid function as a nonlinear transformation.

class SigmoidCrossEntropy(object):

    def activate(self, logits):
        return 1 / (1 + np.exp(-logits))

    def __call__(self, logits, targets):
        probs = self.activate(logits)
        p = np.clip(probs, 1e-10, 1 - 1e-10)
        return np.sum(-targets * np.log(p) - (1 - targets) * np.log(1 - p))

    def delta(self, logits, targets):
        probs = self.activate(logits)
        return probs - targets

Softmax cross entropy for multi-class classification is implemented in the same way, as shown below.
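
The softmax version (taken from the full listing below) subtracts the row-wise maximum from the logits before exponentiating, for numerical stability:

class SoftmaxCrossEntropy(object):

    def activate(self, logits):
        # subtract the per-row maximum for numerical stability
        a = np.exp(logits - np.max(logits, 1, keepdims=True))
        a /= np.sum(a, 1, keepdims=True)
        return a

    def __call__(self, logits, targets):
        probs = self.activate(logits)
        p = probs.clip(min=1e-10)
        return - np.sum(targets * np.log(p))

    def delta(self, logits, targets):
        probs = self.activate(logits)
        return probs - targets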

neural network

Since the cost function carries its own activation function, the final layer should be a LinearLayer. Whether backpropagation is implemented correctly can be checked against a finite-difference approximation of the gradient.

class NeuralNetwork(object):

    def __init__(self, layers, cost_function):
        self.layers = layers
        self.cost_function = cost_function

    def __call__(self, X):
        for layer in self.layers:
            X = layer(X)
        return self.cost_function.activate(X)

    def fit(self, X, t, learning_rate):
        for layer in self.layers:
            X = layer(X)

        delta = self.cost_function.delta(X, t)
        for layer in reversed(self.layers):
            delta = layer.back_propagation(delta, learning_rate)

    def cost(self, X, t):
        for layer in self.layers:
            X = layer(X)
        return self.cost_function(X, t)

    def _gradient_check(self, X=None, t=None, eps=1e-6):
        if X is None:
            X = np.array([[0.5 for _ in range(np.size(self.layers[0].w, 0))]])
        if t is None:
            t = np.zeros((1, np.size(self.layers[-1].w, 1)))
            t[0, 0] = 1.

        e = np.zeros_like(X)
        e[:, 0] += eps
        x_plus_e = X + e
        x_minus_e = X - e
        grad = (self.cost(x_plus_e, t) - self.cost(x_minus_e, t)) / (2 * eps)

        for layer in self.layers:
            X = layer(X)
        delta = self.cost_function.delta(X, t)
        for layer in reversed(self.layers):
            delta = layer.back_propagation(delta, 0)

        print "==================================="
        print "checking gradient"
        print "finite difference", grad
        print " back propagation", delta[0, 0]
        print "==================================="

NeuralNetwork method descriptions:
__init__: define the network structure and the cost function
__call__: forward propagation
fit: train the network
cost: compute the value of the cost function
_gradient_check: check the backpropagation gradient against a finite difference
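
As a quick usage sketch (mirroring the binary classification example further below, and assuming the classes above are in scope or imported from neural_network.py), a network is a list of layers plus a cost function; it can be gradient-checked and then trained with fit:

layers = [TanhLayer(2, 4), LinearLayer(4, 1)]
nn = NeuralNetwork(layers, SigmoidCrossEntropy())
nn._gradient_check()

X = np.random.uniform(-1., 1., size=(100, 2))               # toy inputs
t = (np.prod(X, axis=1) > 0).astype(float).reshape(-1, 1)   # XOR-like labels
for i in range(1000):
    nn.fit(X, t, learning_rate=0.001)
print(nn.cost(X, t))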

Whole code

The whole code is here. Import only what you need from this module and build code to solve regression and classification problems.

neural_network.py


import numpy as np
from scipy.stats import truncnorm


class Layer(object):

    def __init__(self, dim_input, dim_output, std=1., bias=0.):
        # truncate at two standard deviations (truncnorm's a and b are in units of scale)
        self.w = truncnorm(a=-2, b=2, scale=std).rvs((dim_input, dim_output))
        self.b = np.ones(dim_output) * bias

    def __call__(self, X):
        self.input = X
        return self.forward_propagation(X)

    def back_propagation(self, delta, learning_rate):
        # derivative with respect to activation
        delta = delta * self.activation_derivative()

        w = np.copy(self.w)
        self.w -= learning_rate * self.input.T.dot(delta)
        self.b -= learning_rate * np.sum(delta, axis=0)

        # derivative with respect to input
        return delta.dot(w.T)


class LinearLayer(Layer):

    def forward_propagation(self, X):
        return X.dot(self.w) + self.b

    def activation_derivative(self):
        return 1


class SigmoidLayer(Layer):

    def forward_propagation(self, X):
        activation = X.dot(self.w) + self.b
        self.output = 1 / (1 + np.exp(-activation))
        return self.output

    def activation_derivative(self):
        return self.output * (1 - self.output)


class TanhLayer(Layer):

    def forward_propagation(self, X):
        activation = X.dot(self.w) + self.b
        self.output = np.tanh(activation)
        return self.output

    def activation_derivative(self):
        return 1 - self.output ** 2


class ReLULayer(Layer):

    def forward_propagation(self, X):
        activation = X.dot(self.w) + self.b
        self.output = activation.clip(min=0)
        return self.output

    def activation_derivative(self):
        return (self.output > 0).astype(float)


class SigmoidCrossEntropy(object):

    def activate(self, logits):
        return 1 / (1 + np.exp(-logits))

    def __call__(self, logits, targets):
        probs = self.activate(logits)
        p = np.clip(probs, 1e-10, 1 - 1e-10)
        return np.sum(-targets * np.log(p) - (1 - targets) * np.log(1 - p))

    def delta(self, logits, targets):
        probs = self.activate(logits)
        return probs - targets


class SoftmaxCrossEntropy(object):

    def activate(self, logits):
        a = np.exp(logits - np.max(logits, 1, keepdims=True))
        a /= np.sum(a, 1, keepdims=True)
        return a

    def __call__(self, logits, targets):
        probs = self.activate(logits)
        p = probs.clip(min=1e-10)
        return - np.sum(targets * np.log(p))

    def delta(self, logits, targets):
        probs = self.activate(logits)
        return probs - targets


class SumSquaresError(object):

    def activate(self, X):
        return X

    def __call__(self, X, targets):
        return 0.5 * np.sum((X - targets) ** 2)

    def delta(self, X, targets):
        return X - targets


class NeuralNetwork(object):

    def __init__(self, layers, cost_function):
        self.layers = layers
        self.cost_function = cost_function

    def __call__(self, X):
        for layer in self.layers:
            X = layer(X)
        return self.cost_function.activate(X)

    def fit(self, X, t, learning_rate):
        for layer in self.layers:
            X = layer(X)

        delta = self.cost_function.delta(X, t)
        for layer in reversed(self.layers):
            delta = layer.back_propagation(delta, learning_rate)

    def cost(self, X, t):
        for layer in self.layers:
            X = layer(X)
        return self.cost_function(X, t)

    def _gradient_check(self, X=None, t=None, eps=1e-6):
        if X is None:
            X = np.array([[0.5 for _ in range(np.size(self.layers[0].w, 0))]])
        if t is None:
            t = np.zeros((1, np.size(self.layers[-1].w, 1)))
            t[0, 0] = 1.

        e = np.zeros_like(X)
        e[:, 0] += eps
        x_plus_e = X + e
        x_minus_e = X - e
        grad = (self.cost(x_plus_e, t) - self.cost(x_minus_e, t)) / (2 * eps)

        for layer in self.layers:
            X = layer(X)
        delta = self.cost_function.delta(X, t)
        for layer in reversed(self.layers):
            delta = layer.back_propagation(delta, 0)

        print "==================================="
        print "checking gradient"
        print "finite difference", grad
        print " back propagation", delta[0, 0]
        print "==================================="

Binary classification

Put neural_network.py above and this file in the same directory.

binary_classification.py


import pylab as plt
import numpy as np
from neural_network import TanhLayer, LinearLayer, SigmoidCrossEntropy, NeuralNetwork


def create_toy_dataset():
    x = np.random.uniform(-1., 1., size=(1000, 2))
    labels = (np.prod(x, axis=1) > 0).astype(float)
    return x, labels.reshape(-1, 1)


def main():
    x, labels = create_toy_dataset()
    colors = ["blue", "red"]
    plt.scatter(x[:, 0], x[:, 1], c=[colors[int(label)] for label in labels.ravel()])

    layers = [TanhLayer(2, 4), LinearLayer(4, 1)]
    cost_function = SigmoidCrossEntropy()
    nn = NeuralNetwork(layers, cost_function)
    nn._gradient_check()
    for i in range(100000):
        if i % 10000 == 0:
            print("step %6d, cost %f" % (i, nn.cost(x, labels)))
        nn.fit(x, labels, learning_rate=0.001)

    X_test, Y_test = np.meshgrid(np.linspace(-1, 1, 100), np.linspace(-1, 1, 100))
    x_test = np.array([X_test, Y_test]).transpose(1, 2, 0).reshape(-1, 2)
    probs = nn(x_test)
    Probs = probs.reshape(100, 100)
    levels = np.linspace(0, 1, 11)
    plt.contourf(X_test, Y_test, Probs, levels, alpha=0.5)
    plt.colorbar()
    plt.xlim(-1, 1)
    plt.ylim(-1, 1)
    plt.show()


if __name__ == '__main__':
    main()

Regression

Put this file in the same directory as neural_network.py as well.

regression.py


import pylab as plt
import numpy as np
from neural_network import TanhLayer, LinearLayer, SumSquaresError, NeuralNetwork


def create_toy_dataset(func, n=100):
    x = np.random.uniform(size=(n, 1))
    t = func(x) + np.random.uniform(-0.1, 0.1, size=(n, 1))
    return x, t


def main():

    def func(x):
        return x + 0.3 * np.sin(2 * np.pi * x)

    x, t = create_toy_dataset(func)

    layers = [TanhLayer(1, 6, std=1., bias=-0.5), LinearLayer(6, 1, std=1., bias=0.5)]
    cost_function = SumSquaresError()
    nn = NeuralNetwork(layers, cost_function)
    nn._gradient_check()
    for i in range(100000):
        if i % 10000 == 0:
            print("step %6d, cost %f" % (i, nn.cost(x, t)))
        nn.fit(x, t, learning_rate=0.001)

    plt.scatter(x, t, alpha=0.5, label="observation")
    x_test = np.linspace(0, 1, 1000)[:, np.newaxis]
    y = nn(x_test)
    plt.plot(x_test, func(x_test), color="blue", label=r"$x+0.3\sin(2\pi x)$")
    plt.plot(x_test, y, color="red", label="regression")
    plt.legend(loc="upper left")
    plt.xlabel("x")
    plt.ylabel("y")
    plt.show()


if __name__ == '__main__':
    main()

result

Running the code that trains the binary classification network produces output like the following.

Terminal output


===================================
checking gradient
finite difference 0.349788735199
 back propagation 0.349788735237
===================================

The implementation looks correct, since the gradient computed by finite differences and the one computed by backpropagation are close.

A neural network for binary classification is trained with the blue and red points as training data, and the two-dimensional plane is colored according to its output. The animation shows the network during training. (The binary classification code above only displays a still image of the final result.) anime_xor_classification.gif

This is the result of using a neural network for regression. The network is trained with the blue points as training data, and the animation illustrates how its output changes during training. (Again, the regression code above only displays a still image of the final result.) anime_regression.gif

In closing

This time I implemented a neural network and trained it. Next time I will use this code to implement a mixture density network. When an ordinary neural network is used for regression, the cost function corresponds to a unimodal Gaussian model, so it cannot handle multimodal situations. A mixture density network addresses this by building the cost function from a Gaussian mixture.
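
As a reminder of why this is the case (the standard correspondence from PRML, added here for reference): minimizing the sum-of-squares error

E({\bf w}) = {1\over2}\sum_{n=1}^N \|{\bf y}({\bf x}_n, {\bf w}) - {\bf t}_n\|^2

is equivalent to maximum likelihood estimation under a single, unimodal Gaussian model

p({\bf t}|{\bf x}, {\bf w}) = \mathcal{N}({\bf t}\,|\,{\bf y}({\bf x}, {\bf w}), \beta^{-1}{\bf I})

which is exactly what the mixture density network generalizes by replacing the single Gaussian with a mixture.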
