[PYTHON] [Rabbit Challenge (E qualification)] Deep learning (day4)

Introduction

This is a learning record from when I took the Rabbit Challenge with the aim of passing the Japan Deep Learning Association (JDLA) E qualification exam to be held on January 19, 2022.

Rabbit Challenge is a course that uses teaching materials edited from recorded videos of the in-person course "Deep learning course that can be applied in the field". There is no support for questions, but it is an inexpensive course (the lowest price as of June 2020) for preparing for the E qualification exam.

Please check the details from the link below.

List of subjects

Applied Mathematics
Machine learning
Deep learning (day1)
Deep learning (day2)
Deep learning (day3)
Deep learning (day4)

Section1: Implementation exercise

Tensorflow

Linear regression

--Changed the noise value for the function $d = 3x + 2$
image.png

--Changed the function $d$ to $d = -6x + 3$ and changed the noise value
image.png

→ It was confirmed that the larger the noise value, the larger the error and the lower the prediction accuracy.
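The exercise itself uses the course's TensorFlow notebook; as a minimal stand-in sketch (a Keras Dense(1) model and a hypothetical noise_scale parameter are my own assumptions), the noise experiment can be reproduced roughly as follows.


#Minimal stand-in sketch (assumption): fit d = 3x + 2 with adjustable Gaussian noise
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

noise_scale = 0.3  # hypothetical value; increase it to see the error grow

np.random.seed(0)
x = np.random.rand(100, 1)
d = 3 * x + 2 + noise_scale * np.random.randn(100, 1)

model = Sequential()
model.add(Dense(1, input_dim=1))
model.compile(loss='mean_squared_error', optimizer='sgd')
model.fit(x, d, epochs=200, verbose=0)
print(model.get_weights())  # should approach a weight of 3 and a bias of 2 when the noise is small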

Nonlinear regression

--Changed the noise value for the function $d = -0.4x^3 + 1.6x^2 - 2.8x + 1$
image.png

--Changed the function $d$ to $d = 2.1x^3 - 1.2x^2 - 4.8x + 2$ and changed the noise value
image.png

→ It was confirmed that the larger the noise value, the larger the error and the lower the prediction accuracy.
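Here too, as a minimal stand-in sketch (not the course notebook; the hidden-layer size and noise_scale below are assumed values), the nonlinear case needs at least one hidden layer, since a single Dense(1) unit can only fit a straight line.


#Minimal stand-in sketch (assumption): nonlinear regression on d = -0.4x^3 + 1.6x^2 - 2.8x + 1
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

noise_scale = 0.05  # hypothetical value to vary

np.random.seed(0)
x = np.random.uniform(-1, 1, (200, 1))
d = -0.4 * x**3 + 1.6 * x**2 - 2.8 * x + 1 + noise_scale * np.random.randn(200, 1)

model = Sequential()
model.add(Dense(10, activation='relu', input_dim=1))  # the hidden layer provides the nonlinearity
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(x, d, epochs=500, verbose=0)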

Classification 1 layer (mnist)

image.png

Classification 3 layers (mnist)

--Changed the size of the hidden layers
image.png
→ It was confirmed that the larger the hidden layers, the higher the prediction accuracy.

--Changed the optimizer
image.png
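A minimal Keras stand-in for this exercise (the course notebook itself is written in TensorFlow; the layer sizes and optimizer name below are assumed values) shows where the hidden-layer sizes and the optimizer are varied.


#Minimal stand-in sketch (assumption): 3-layer MNIST classifier with variable hidden sizes and optimizer
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import to_categorical

(x_train, d_train), (x_test, d_test) = mnist.load_data()
x_train = x_train.reshape(-1, 784) / 255.
x_test = x_test.reshape(-1, 784) / 255.
d_train = to_categorical(d_train, 10)
d_test = to_categorical(d_test, 10)

hidden_1, hidden_2 = 512, 256  # hidden layer sizes to vary
optimizer = 'adam'             # optimizer to vary ('sgd', 'rmsprop', 'adam', ...)

model = Sequential()
model.add(Dense(hidden_1, activation='relu', input_shape=(784,)))
model.add(Dense(hidden_2, activation='relu'))
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
model.fit(x_train, d_train, batch_size=128, epochs=5, validation_data=(x_test, d_test))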

Classification CNN (mnist)

image.png

keras

Linear regression

image.png

Simple perceptron

Simple_perceptron


#Module loading
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.optimizers import SGD
 
#Initialize random numbers with fixed values
np.random.seed(0)

#Creating a simple perceptron for sigmoid
model = Sequential()
model.add(Dense(input_dim=2, units=1))
model.add(Activation('sigmoid'))
model.summary()

model.compile(loss='binary_crossentropy', optimizer=SGD(lr=0.1))
 
#Training input X and correct answer data T
X = np.array( [[0,0], [0,1], [1,0], [1,1]] )
T = np.array( [[0], [1], [1], [1]] )
 
#training
model.fit(X, T, epochs=30, batch_size=1)
 
#Actual classification by diverting training input
Y = model.predict_classes(X, batch_size=1)

print("TEST")
print(Y == T)

--Changed np.random.seed(0) to np.random.seed(1)

Since the generated random numbers are fixed at different values as shown below, the learning result also changes.

python:np.random.seed()


import numpy as np
 
#Initialize random numbers with fixed values
np.random.seed(0)
print(np.random.randn()) -> 1.764052345967664
#Initialize random numbers with fixed values
np.random.seed(1)
print(np.random.randn()) -> 1.6243453636632417

--Changed the number of epochs to 100

It was confirmed that increasing the number of epochs decreased the loss.

--Changed to AND circuit and XOR circuit

It was confirmed that the OR circuit and AND circuit can be learned because they are linearly separable, but the XOR circuit cannot be learned because it is not linearly separable (see the sketch after this list).

--Change batch size to 10 with OR circuit

It was confirmed that increasing the batch size increased the loss.

--Changed the number of epochs to 300

It was confirmed that increasing the number of epochs decreased the loss.
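For the AND / XOR item above, only the target data T changes; a hedged snippet (reusing the X and single-layer model from Simple_perceptron above) looks like this.


#Hedged snippet: only the target data T changes for the AND / XOR experiments (same X and model as above)
T_and = np.array([[0], [0], [0], [1]])  # AND: linearly separable, so it can be learned
T_xor = np.array([[0], [1], [1], [0]])  # XOR: not linearly separable, so a single-layer perceptron fails
model.fit(X, T_xor, epochs=30, batch_size=1)  # the loss stays high for XOR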

Classification (iris)

classifiy_iris


import matplotlib.pyplot as plt
from sklearn import datasets
iris = datasets.load_iris()
x = iris.data
d = iris.target

from sklearn.model_selection import train_test_split
x_train, x_test, d_train, d_test = train_test_split(x, d, test_size=0.2)

from keras.models import Sequential
from keras.layers import Dense, Activation
# from keras.optimizers import SGD

#Model settings
model = Sequential()
model.add(Dense(12, input_dim=4))
model.add(Activation('relu'))
# model.add(Activation('sigmoid'))
model.add(Dense(3, input_dim=12))
model.add(Activation('softmax'))
model.summary()

model.compile(optimizer='sgd', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

history = model.fit(x_train, d_train, batch_size=5, epochs=20, verbose=1, validation_data=(x_test, d_test))
loss = model.evaluate(x_test, d_test, verbose=0)

#Accuracy
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.ylabel('accuracy', fontsize=14)
plt.xlabel('epoch', fontsize=14)
plt.legend(['train', 'test'], loc='lower right', fontsize=14)
plt.ylim(0, 1.0)
plt.show()

image.png

--Changed the activation function of the middle layer to sigmoid

→ A decrease in accuracy was confirmed.

image.png

--Imported SGD and changed the optimizer to SGD(lr=0.1)

→ It seems that the variation in accuracy for each epoch has increased.

image.png

Classification (mnist)

classify_mnist


#Import required libraries
import sys, os
sys.path.append(os.pardir)  #Settings for importing files in the parent directory
import keras
import matplotlib.pyplot as plt
from data.mnist import load_mnist

(x_train, d_train), (x_test, d_test) = load_mnist(normalize=True, one_hot_label=True)

#Use Adam for importing and optimizing required libraries
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import Adam

#Modeling
model = Sequential()
model.add(Dense(512, activation='relu', input_shape=(784,)))
model.add(Dropout(0.2))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(10, activation='softmax'))
model.summary()

#Batch size, number of epochs
batch_size = 128
epochs = 20

model.compile(loss='categorical_crossentropy', 
              optimizer=Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0, amsgrad=False), 
              metrics=['accuracy'])

history = model.fit(x_train, d_train, batch_size=batch_size, epochs=epochs, verbose=1, validation_data=(x_test, d_test))
loss = model.evaluate(x_test, d_test, verbose=0)
print('Test loss:', loss[0])
print('Test accuracy:', loss[1])

# Accuracy
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
# plt.ylim(0, 1.0)
plt.show()

image.png

--Changed one_hot_label of load_mnist to False, or changed the loss function to sparse_categorical_crossentropy (see the sketch after this list) → When 'categorical_crossentropy' is used as the loss function, the labels must be in one-hot format; when 'sparse_categorical_crossentropy' is used, the labels must be integer class labels rather than one-hot vectors.

--Changed the values of Adam's arguments
image.png
→ When the learning rate was changed from 0.0001 to 0.001, the prediction accuracy was almost the same, but learning proceeded faster. However, when the learning rate was changed from 0.001 to 0.01, the prediction accuracy decreased.
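Regarding the first item above, the two consistent label/loss configurations can be sketched as follows (a hedged example that reuses the names from the notebook above, not additional course code).


#Hedged sketch of the two consistent label/loss combinations (reusing names from the notebook above)
#One-hot labels pair with categorical_crossentropy
(x_train, d_train), (x_test, d_test) = load_mnist(normalize=True, one_hot_label=True)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

#Integer class labels pair with sparse_categorical_crossentropy
(x_train, d_train), (x_test, d_test) = load_mnist(normalize=True, one_hot_label=False)
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])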

CNN classification (mnist)

classify_mnist_with_CNN


#Import required libraries
import sys, os
sys.path.append(os.pardir)  #Settings for importing files in the parent directory
import keras
import matplotlib.pyplot as plt
from data.mnist import load_mnist

(x_train, d_train), (x_test, d_test) = load_mnist(normalize=True, one_hot_label=True)


#Processing to input as a matrix
batch_size = 128
num_classes = 10
epochs = 20

img_rows, img_cols = 28, 28

x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
input_shape = (img_rows, img_cols, 1)


#Use Adam for importing and optimizing required libraries
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.optimizers import Adam

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
model.summary()

#Batch size, number of epochs
batch_size = 128
epochs = 20

model.compile(loss='categorical_crossentropy', optimizer=Adam(), metrics=['accuracy'])
history = model.fit(x_train, d_train, batch_size=batch_size, epochs=epochs, verbose=1, validation_data=(x_test, d_test))

# Accuracy
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
# plt.ylim(0, 1.0)
plt.show()

image.png

→ It was confirmed that with the CNN a very high prediction accuracy of 99% or more can be obtained.

CIFAR-10

classify_cifar10


#Import the CIFAR-10 dataset
from keras.datasets import cifar10
(x_train, d_train), (x_test, d_test) = cifar10.load_data()

#CIFAR-10 normalization
from keras.utils import to_categorical
  
#Feature normalization
x_train = x_train/255.
x_test = x_test/255.
 
#Class label 1-hot vectorization
d_train = to_categorical(d_train, 10)
d_test = to_categorical(d_test, 10)
 
#Build CNN
import keras
import matplotlib.pyplot as plt  # needed for the accuracy plot below
from keras.models import Sequential
from keras.layers.convolutional import Conv2D, MaxPooling2D
from keras.layers.core import Dense, Dropout, Activation, Flatten
import numpy as np
 
model = Sequential()
 
model.add(Conv2D(32, (3, 3), padding='same',input_shape=x_train.shape[1:]))
model.add(Activation('relu'))
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
 
model.add(Conv2D(64, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
 
model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(10))
model.add(Activation('softmax'))
 
#compile
model.compile(loss='categorical_crossentropy',optimizer='adam',metrics=['accuracy'])
 
#Training
history = model.fit(x_train, d_train, epochs=20)

#Evaluation&Evaluation result output
print(model.evaluate(x_test, d_test))

# Accuracy
plt.plot(history.history['accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.show()

#Save model
# model.save('./CIFAR-10.h5')

image.png

RNN (Forecast of Binary Addition)

RNN (Forecast of Binary Addition)


# import tensorflow as tf
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

#Change logging level
tf.logging.set_verbosity(tf.logging.ERROR)

import numpy as np
import matplotlib.pyplot as plt

import keras
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.layers.wrappers import TimeDistributed
from keras.optimizers import SGD
from keras.layers.recurrent import SimpleRNN, LSTM, GRU


#Prepare the data
#Number of binary digits
binary_dim = 8
#Maximum value + 1
largest_number = pow(2, binary_dim)

#Prepare binary representations of the numbers up to largest_number
binary = np.unpackbits(np.array([range(largest_number)], dtype=np.uint8).T,axis=1)[:, ::-1]


# A,B initialization(a + b = d)
a_int = np.random.randint(largest_number/2, size=20000)
a_bin = binary[a_int] # binary encoding
b_int = np.random.randint(largest_number/2, size=20000)
b_bin = binary[b_int] # binary encoding

x_int = []
x_bin = []
for i in range(10000):
    x_int.append(np.array([a_int[i], b_int[i]]).T)
    x_bin.append(np.array([a_bin[i], b_bin[i]]).T)

x_int_test = []
x_bin_test = []
for i in range(10001, 20000):
    x_int_test.append(np.array([a_int[i], b_int[i]]).T)
    x_bin_test.append(np.array([a_bin[i], b_bin[i]]).T)

x_int = np.array(x_int)
x_bin = np.array(x_bin)
x_int_test = np.array(x_int_test)
x_bin_test = np.array(x_bin_test)


#Correct answer data
d_int = a_int + b_int
d_bin = binary[d_int][0:10000]
d_bin_test = binary[d_int][10001:20000]

model = Sequential()

model.add(SimpleRNN(units=16,
               return_sequences=True,
               input_shape=[8, 2],
               go_backwards=False,
               activation='relu',
               # dropout=0.5,
               # recurrent_dropout=0.3,
               # unroll = True,
            ))
#Output layer
model.add(Dense(1, activation='sigmoid', input_shape=(-1,2)))
model.summary()
model.compile(loss='mean_squared_error', optimizer=SGD(lr=0.1), metrics=['accuracy'])
# model.compile(loss='mse', optimizer='adam', metrics=['accuracy'])

history = model.fit(x_bin, d_bin.reshape(-1, 8, 1), epochs=5, batch_size=2)

#Test result output
score = model.evaluate(x_bin_test, d_bin_test.reshape(-1,8,1), verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

Epoch 1/5 - 28s 3ms/step - loss: 0.0810 - accuracy: 0.9156
Epoch 2/5 - 28s 3ms/step - loss: 0.0029 - accuracy: 1.0000
Epoch 3/5 - 28s 3ms/step - loss: 9.7898e-04 - accuracy: 1.0000
Epoch 4/5 - 28s 3ms/step - loss: 4.8765e-04 - accuracy: 1.0000
Epoch 5/5 - 28s 3ms/step - loss: 3.3379e-04 - accuracy: 1.0000
Test loss: 0.0002865186012966387
Test accuracy: 1.0

--Changed the number of RNN output nodes to 128

Epoch 1/5 - 29s 3ms/step - loss: 0.0741 - accuracy: 0.9194
Epoch 2/5 - 29s 3ms/step - loss: 0.0020 - accuracy: 1.0000
Epoch 3/5 - 29s 3ms/step - loss: 6.9808e-04 - accuracy: 1.0000
Epoch 4/5 - 28s 3ms/step - loss: 4.0773e-04 - accuracy: 1.0000
Epoch 5/5 - 28s 3ms/step - loss: 2.8269e-04 - accuracy: 1.0000
Test loss: 0.0002439875359702302
Test accuracy: 1.0

→ When the number of output nodes was changed from 16 to 128, the prediction accuracy improved.

--Changed the RNN output activation function to sigmoid

Epoch 1/5 - 28s 3ms/step - loss: 0.2498 - accuracy: 0.5131
Epoch 2/5 - 28s 3ms/step - loss: 0.2487 - accuracy: 0.5302
Epoch 3/5 - 28s 3ms/step - loss: 0.2469 - accuracy: 0.5481
Epoch 4/5 - 28s 3ms/step - loss: 0.2416 - accuracy: 0.6096
Epoch 5/5 - 27s 3ms/step - loss: 0.2166 - accuracy: 0.7125
Test loss: 0.18766544096552618
Test accuracy: 0.7449744939804077

→ When the output activation function was changed from ReLU to sigmoid, the prediction accuracy decreased.

--Changed the RNN output activation function to tanh

Epoch 1/5 - 28s 3ms/step - loss: 0.1289 - accuracy: 0.8170
Epoch 2/5 - 28s 3ms/step - loss: 0.0022 - accuracy: 1.0000
Epoch 3/5 - 28s 3ms/step - loss: 7.1403e-04 - accuracy: 1.0000
Epoch 4/5 - 27s 3ms/step - loss: 4.1603e-04 - accuracy: 1.0000
Epoch 5/5 - 28s 3ms/step - loss: 2.8925e-04 - accuracy: 1.0000
Test loss: 0.00024679134038564263
Test accuracy: 1.0

→ When the output activation function was changed from ReLU to tanh, the prediction accuracy improved.

--Changed the optimization method to Adam

Epoch 1/5 - 31s 3ms/step - loss: 0.0694 - accuracy: 0.9385
Epoch 2/5 - 32s 3ms/step - loss: 0.0012 - accuracy: 1.0000
Epoch 3/5 - 31s 3ms/step - loss: 5.4037e-05 - accuracy: 1.0000
Epoch 4/5 - 31s 3ms/step - loss: 3.3823e-06 - accuracy: 1.0000
Epoch 5/5 - 31s 3ms/step - loss: 2.4213e-07 - accuracy: 1.0000
Test loss: 5.572893907904208e-08
Test accuracy: 1.0

→ When the optimization method was changed from SGD to Adam, the prediction accuracy improved.

--Set the RNN input Dropout to 0.5

Epoch 1/5 - 30s 3ms/step - loss: 0.2324 - accuracy: 0.5875
Epoch 2/5 - 31s 3ms/step - loss: 0.2106 - accuracy: 0.6230
Epoch 3/5 - 31s 3ms/step - loss: 0.2046 - accuracy: 0.6264
Epoch 4/5 - 31s 3ms/step - loss: 0.2032 - accuracy: 0.6244
Epoch 5/5 - 30s 3ms/step - loss: 0.1995 - accuracy: 0.6350
Test loss: 0.15461890560702712
Test accuracy: 0.8619986772537231

→ When the input Dropout was set to 0.5, learning did not proceed well.

--Set the RNN recurrent Dropout to 0.3

Epoch 1/5 - 32s 3ms/step - loss: 0.1459 - accuracy: 0.8306
Epoch 2/5 - 32s 3ms/step - loss: 0.0947 - accuracy: 0.9058
Epoch 3/5 - 32s 3ms/step - loss: 0.0895 - accuracy: 0.9099
Epoch 4/5 - 31s 3ms/step - loss: 0.0881 - accuracy: 0.9112
Epoch 5/5 - 31s 3ms/step - loss: 0.0871 - accuracy: 0.9120
Test loss: 0.09418297858566198
Test accuracy: 0.9016276597976685

→ When the recurrent Dropout was set to 0.3, the prediction accuracy decreased.

--Set RNN unroll to True

Epoch 1/5 - 20s 2ms/step - loss: 0.0938 - accuracy: 0.8871
Epoch 2/5 - 20s 2ms/step - loss: 0.0032 - accuracy: 0.9999
Epoch 3/5 - 20s 2ms/step - loss: 9.4733e-04 - accuracy: 1.0000
Epoch 4/5 - 21s 2ms/step - loss: 5.3410e-04 - accuracy: 1.0000
Epoch 5/5 - 20s 2ms/step - loss: 3.6424e-04 - accuracy: 1.0000
Test loss: 0.0003125413816745379
Test accuracy: 1.0

→ When unroll was set to True (the recurrence is computed without a loop), the prediction accuracy dropped slightly, but the computation was faster because memory is allocated all at once.

Section2: Reinforcement learning

What is reinforcement learning?

A field of machine learning that aims to create agents that can choose actions in an environment so as to maximize long-term reward. → A mechanism for improving the principle by which actions are decided, based on the profit (reward) obtained as a result of those actions.

In supervised and unsupervised learning, the goal is to make predictions from data or to find the patterns contained in the data. In reinforcement learning, on the other hand, the goal is to find a good policy. If we had perfect knowledge of the environment in advance, we could predict and determine the optimal behavior directly; in reinforcement learning, the agent instead acts on incomplete knowledge, collecting data as it goes and gradually working toward the optimal behavior.

--Q-learning: a method that advances learning by updating the action value function each time an action is taken (see the sketch after this list).

--Function approximation method: a method that approximates the value function or the policy function with a parameterized function.
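As a hedged illustration of the Q-learning update mentioned above (not course code; the state/action counts and hyperparameters are assumed values), each step updates one entry of the action value table:


#Tabular Q-learning update sketch (assumed sizes and hyperparameters)
import numpy as np

n_states, n_actions = 5, 2
alpha, gamma = 0.1, 0.9          # learning rate and discount factor
Q = np.zeros((n_states, n_actions))

def q_update(s, a, r, s_next):
    # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])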

Trade-off between exploration and exploitation

If you only ever take the action that looks best according to historical data, you cannot discover other, possibly better actions (insufficient exploration). $\Longleftrightarrow$ If you only ever take unknown actions, you cannot make use of past experience (insufficient exploitation).
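A common way to balance the two is ε-greedy action selection; the following minimal sketch (the Q table and ε value are assumptions, not course material) explores with probability ε and exploits otherwise.


#Epsilon-greedy action selection sketch (assumed names)
import numpy as np

def epsilon_greedy(Q, state, epsilon=0.1):
    if np.random.rand() < epsilon:
        return np.random.randint(Q.shape[1])   # explore: random action
    return int(np.argmax(Q[state]))            # exploit: best known action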

Image of reinforcement learning

image.png

Value function

There are two types of functions that express value: the state value function (which focuses on the value of a state) and the action value function (which focuses on the value of a state-action pair).
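For reference, the standard definitions (not course-specific notation), with discount factor $\gamma$ and reward $r_t$, are:

V^{\pi}(s) = \mathbb{E}_{\pi}\Big[\sum_{t=0}^{\infty} \gamma^{t} r_{t+1} \,\Big|\, s_0 = s\Big]

Q^{\pi}(s, a) = \mathbb{E}_{\pi}\Big[\sum_{t=0}^{\infty} \gamma^{t} r_{t+1} \,\Big|\, s_0 = s, a_0 = a\Big]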

Policy function

A policy function is a function that, in policy-based reinforcement learning methods, gives the probability of taking each action in a given state.
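In standard notation (added here for reference, not from the course slides), a stochastic policy is written as:

\pi(a \mid s) = P(\text{action } a \text{ is taken in state } s)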

Policy gradient method

\theta^{(t+1)} = \theta^{(t)}+\epsilon \nabla J(\theta)
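Here $\theta$ are the policy parameters, $\epsilon$ is the learning rate, and $J(\theta)$ is the expected return. The gradient is commonly expressed (a standard result, stated here for reference) as:

\nabla_{\theta} J(\theta) = \mathbb{E}_{\pi_{\theta}}\big[\nabla_{\theta} \log \pi_{\theta}(a \mid s)\, Q^{\pi}(s, a)\big]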
