Try the book "Introduction to Natural Language Processing Application Development in 15 Steps" - Chapter 3, Step 08 memo: "Introduction to Neural Networks"

Contents

This is a memo to myself as I read "Introduction to Natural Language Processing Application Development in 15 Steps". This time, for Chapter 3, Step 08, I write down the points that stood out to me.

Preparation

- Personal Mac: macOS Mojave 10.14.6
- docker version: 19.03.2 (both Client and Server)

Chapter overview

Chapter 3 explains the basics of deep learning and its application to natural language processing. Step 08, as an introduction to neural networks, gives an overview of the multilayer perceptron and a simple implementation using the deep learning library Keras.

08.1 Simple perceptron

A **perceptron** is a mathematical model of the neurons that make up an organism's brain cells, and a model that imitates a single neuron is called a **simple perceptron**.

- n inputs: x1, x2, ..., xn
- n weights: w1, w2, ..., wn
- fixed-value bias: b
- output: z
- output function: f()

z = f(x1w1 + x2w2 + ... + xnwn + b)

The formula above could be coded exactly as written, but using NumPy's vector dot product makes the code both concise and fast. When the simple perceptron is viewed as a classifier, finding appropriate values for the weights (w and b) is **learning**.

import numpy as np

# example values (the book leaves these as placeholders)
x = np.array([1.0, 2.0, 3.0])   # inputs x1, ..., xn
w = np.array([0.5, -0.5, 1.0])  # weights w1, ..., wn
b = 0.1                         # bias

z = b + np.dot(x, w)  # weighted sum; apply the output function f() for the final output

08.2 Multilayer Perceptron

Just as nerve cells in the brain are connected to each other, the output of one simple perceptron can be used as the input of another, producing a structure in which many perceptrons are connected. This is called a **multilayer perceptron (MLP)**.

test_mlp.py


import numpy as np

# weights of the two first-layer perceptrons (2 rows x 3 columns)
W_1 = np.array([
    [1, 2, 3],
    [4, 5, 6],
])

# input vector (3 elements)
x = np.array([10, 20, 30])

print(np.dot(W_1, x))  # (2, 3) . (3,) -> (2,): works
print(np.dot(x, W_1))  # (3,) . (2, 3): shapes not aligned, raises ValueError
[140 320]
Traceback (most recent call last):
  File "test_mlp.py", line 11, in <module>
    print(np.dot(x, W_1))
ValueError: shapes (3,) and (2,3) not aligned: 3 (dim 0) != 2 (dim 0)

Above, the weights of the two perceptrons in the MLP's first layer are stored in W_1 (a 2-by-3 array). Taking the dot product of W_1 and the 3-by-1 input vector yields a 2-by-1 output (2x3 times 3x1 = 2x1).

Of course, computing the dot product with W_1 and x swapped raises an error, because the shapes do not align.

While a simple perceptron can only be applied to linearly separable problems, **a multilayer perceptron can also be applied to linearly inseparable problems**.
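As a quick illustration (my own sketch, not code from the book), XOR is the classic linearly inseparable problem: no single perceptron can compute it, but a two-layer MLP can. The step activation and the hand-picked weights below are assumptions for illustration, with each layer written as a function:

import numpy as np

def step(u):
    # step activation: 1 where u > 0, else 0
    return (u > 0).astype(int)

def layer(x, W, b):
    # one layer of perceptrons: weighted sums plus biases, then activation
    return step(np.dot(W, x) + b)

# hand-picked weights (hypothetical, for illustration):
# hidden unit 1 computes OR, hidden unit 2 computes AND,
# and the output computes "OR and not AND", which is XOR
W_1, b_1 = np.array([[1, 1], [1, 1]]), np.array([-0.5, -1.5])
W_2, b_2 = np.array([[1, -1]]), np.array([-0.5])

for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    h = layer(np.array(x), W_1, b_1)  # first-layer output
    z = layer(h, W_2, b_2)            # second-layer output
    print(x, '->', z)                 # 0, 1, 1, 0: the XOR of the two inputs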

08.3 Deep Learning Library Keras

In 08.2, each layer was implemented as a function; using a library lets you write the same thing concisely.

Library loading and model initialization

from keras.layers import Dense
from keras.models import Sequential

model = Sequential()

Implementation of each layer

# implementation of the 1st layer
model.add(Dense(units=2, activation='relu', input_dim=3))

# implementation of the 2nd layer
model.add(Dense(units=1, activation='sigmoid'))

- Dense: fully connected layer
- input_dim: number of input dimensions. The output dimension of the previous layer becomes the input dimension of each subsequent layer, so it can be omitted from the second layer on.
- units: number of output dimensions
- activation: activation function
  - relu: sigmoid and the like used to be common, but ReLU excels in accuracy and ease of convergence
  - sigmoid: used in the final layer of an MLP for two-class classification
  - hyperbolic tangent (tanh): tends to perform better than the sigmoid it replaces
  - softmax: used in the final layer of an MLP for multiclass classification
- add: adds a layer to the model instance created by Sequential()
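For intuition, here is a minimal NumPy sketch of the activation functions listed above (my own illustration; in Keras you simply pass their names to the activation= argument):

import numpy as np

def relu(u):
    return np.maximum(0, u)      # clips negative values to 0

def sigmoid(u):
    return 1 / (1 + np.exp(-u))  # squashes into (0, 1): a probability for 2-class output

def softmax(u):
    e = np.exp(u - u.max())      # shift by the max for numerical stability
    return e / e.sum()           # normalizes into a probability distribution over classes

u = np.array([-2.0, 0.0, 3.0])
print(relu(u))     # [0. 0. 3.]
print(sigmoid(u))  # values between 0 and 1
print(np.tanh(u))  # values between -1 and 1
print(softmax(u))  # non-negative values that sum to 1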

MLP learning settings

In classifier training, pairs of a feature vector X and a correct label y are given as training data. The classifier's parameters (weights) are adjusted so that the output produced for input X is close to y.

from keras.optimizers import Adam

model.compile(loss='binary_crossentropy', optimizer=Adam(lr=0.001))
# note: newer Keras versions name this argument learning_rate instead of lr

- loss: loss function. Evaluates the model's accuracy via a function expressing how far the output for X deviates from y.
- binary_crossentropy: a loss function suited to two-class classifiers
- optimizer: optimization method. Adjusting the parameters is called optimization, and there are various methods.
- lr: learning rate. A parameter that determines how much a weight is increased or decreased in a single update.
  - too large: the weights change so much that they oscillate around the optimum or, at worst, diverge without converging
  - too small: training takes too long

Weights of each layer in Keras

When you instantiate a layer in Keras, its weights are implicitly **initialized with random numbers**. To access the weights, use the .get_weights() and .set_weights() methods. As noted in 10.3, to control how the weights are initialized, specify it with model.add(Dense(.., kernel_initializer=...)).
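A minimal sketch of reading and overwriting those weights, assuming the two-layer model built in 08.3 (zeroing the weights is purely illustrative):

import numpy as np

# the first Dense layer of the model built above
layer = model.layers[0]

kernel, bias = layer.get_weights()  # a list of NumPy arrays: kernel and bias
print(kernel.shape, bias.shape)     # (3, 2) and (2,) for Dense(units=2, input_dim=3)

# set_weights expects arrays of the same shapes
layer.set_weights([np.zeros_like(kernel), np.zeros_like(bias)])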

08.4 Training the Multilayer Perceptron

model.fit(X, y, batch_size=32, epochs=100)

- batch_size: how many of the training samples are used together in a single weight update
- epochs: how many times each training sample is used over the course of training (the number of passes over the data)
- fit: the training process. Backpropagation is used internally.
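Putting 08.3 and 08.4 together, here is a minimal end-to-end sketch. The toy data (random 3-dimensional features labeled by a simple rule) is my own assumption, not from the book:

import numpy as np
from keras.layers import Dense
from keras.models import Sequential
from keras.optimizers import Adam

# toy data: 100 random 3-dimensional feature vectors with binary labels
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = (X.sum(axis=1) > 0).astype(int)  # an easily learnable rule, for illustration only

model = Sequential()
model.add(Dense(units=2, activation='relu', input_dim=3))
model.add(Dense(units=1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer=Adam(lr=0.001))

model.fit(X, y, batch_size=32, epochs=100, verbose=0)
print(model.predict(X[:5]))  # predicted probabilities; compare against y[:5]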

08.5 What is a neural network?

The obsession with faithfully imitating brain cells has been abandoned; neural networks have since been developed with the aim of making computers perform intelligent processing.

Keras is a wrapper library for TensorFlow. TensorFlow is a well-known deep learning library developed by Google, and it exposes many low-level APIs, so implementation becomes easier by using Keras, which provides high-level APIs on top of it. Since most neural network computation consists of vector and matrix operations, using a GPU is a natural way to train larger neural networks quickly.
