This is the second article in the series. Can a neural network (NN) be trained to output the average of some given numerical data? And what about the standard deviation? Let's find out.
The problems an NN can learn are roughly divided into "classification problems" and "regression problems". This one is a regression problem.
In general, preparing training data is the hard part of training an NN, but this time it is easy.
To begin, I decided to prepare 50,000 training sets. One set consists of 10 numbers, drawn from a normal distribution with mean a and standard deviation b via numpy's random.normal(a, b, 10); here a and b themselves are generated with numpy's random.rand().
First, compute each set's 10 numbers along with their mean and standard deviation, and store them in plain Python lists.
001.py
import numpy as np
trainDataSize = 50000 #Number of datasets to create
dataLength = 10 #Number of data per set
d = []            #Empty list. 10 numbers go in per set.
average_std = []  #Second empty list. Two numbers (mean, std) go in per set.
for num in range(trainDataSize):
    xx = np.random.normal(np.random.rand(), np.random.rand(), dataLength)
    average_std.append(np.mean(xx))
    average_std.append(np.std(xx))
    d.append(xx)
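As a quick sanity check (my addition, not in the original), the stored values can be compared with a direct recomputation on the first set:

# At this point d and average_std are still plain lists: d[0] is the first
# set of 10 numbers; average_std[0] and average_std[1] are its stored stats.
print(average_std[0], np.mean(d[0]))  # the two means agree
print(average_std[1], np.std(d[0]))   # the two standard deviations agree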
Once the lists hold all 50,000 sets, convert them to ndarrays.
002.py
d = np.array(d)                     #Convert to ndarray.
average_std = np.array(average_std) #Convert to ndarray.
The reason I don't use an ndarray from the beginning is that appending to one is slow.
002.py
#Bad code. This is slow.
d = np.array([])  #Empty numpy array
for num in range(trainDataSize):
    xx = np.random.normal(np.random.rand(), np.random.rand(), dataLength)
    d = np.append(d, xx)  #This call is slow!
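To see the difference concretely, here is a rough timing sketch (my addition): np.append copies the entire array on every call, so the loop is quadratic overall, while appending to a Python list and converting once at the end is linear.

import time
import numpy as np

n = 20000  # fewer iterations than 50,000 so the slow version finishes quickly

start = time.time()
a = np.array([])
for _ in range(n):
    a = np.append(a, np.zeros(10))  # reallocates and copies the whole array each time
print("np.append loop:   ", time.time() - start, "s")

start = time.time()
b = []
for _ in range(n):
    b.append(np.zeros(10))  # list append is amortized O(1)
b = np.array(b)             # single conversion at the end
print("list + np.array():", time.time() - start, "s")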
In the resulting ndarrays the numbers were simply appended in order, so now fix up the matrix shapes: average_std in particular arrives as a flat run of 100,000 values and needs to become 50,000 rows of (mean, std).
003.py
d = d.reshape(50000, 10)                    #Already (50000, 10); kept as a safeguard.
average_std = average_std.reshape(50000, 2) #Flat 100,000 values -> 50,000 (mean, std) rows.
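A quick check of the shapes (my addition) confirms the layout:

print(d.shape)            # (50000, 10): one row per set
print(average_std.shape)  # (50000, 2): one (mean, std) pair per row
print(average_std[0])     # mean and std of the first set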
Divide the 50,000 sets in two: 40,000 for training and 10,000 for evaluation. We will not tune hyperparameters this time, so a simple two-way split (with no separate validation set) is enough.
004.py
#Train on the first 40,000 sets; evaluate on the last 10,000.
d_training_x = d[:40000,:]
d_training_y = average_std[:40000,:]
d_test_x = d[40000:,:]
d_test_y = average_std[40000:,:]
Now build the NN itself. The key points are the 10 input slots and the 2 output slots.
005.py
import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam
model = Sequential()
model.add(Dense(100, activation='tanh', input_shape=(10,)))#There are 10 input slots.
model.add(Dense(100, activation='tanh'))
model.add(Dense(40, activation='sigmoid'))
model.add(Dense(20, activation='sigmoid'))
model.add(Dense(2, activation='linear')) #There are two output slots.
#Optimizer: Adam, a variant of stochastic gradient descent
optimizer = Adam(lr=0.001, beta_1=0.9, beta_2=0.999)
#Loss function: mean squared error
model.compile(loss='mean_squared_error',optimizer=optimizer)
model.summary() #NN summary output
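For reference, the parameter counts that model.summary() prints can be verified by hand: a Dense layer has (inputs × units + units) parameters, the extra term being the biases.

# Dense(100), 10 inputs :  10*100 + 100 =  1,100 parameters
# Dense(100)            : 100*100 + 100 = 10,100
# Dense(40)             : 100*40  +  40 =  4,040
# Dense(20)             :  40*20  +  20 =    820
# Dense(2)              :  20*2   +   2 =     42
# Total                 :                 16,102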
Finally, feed in the training data.
006.py
history = model.fit(d_training_x, d_training_y,
                    batch_size=256,  #Feed the training data 256 sets at a time.
                    epochs=20,       #Number of full passes over the training data.
                    verbose=1,       #"verbose" means talkative; 1 prints progress per epoch.
                    validation_data=(d_test_x, d_test_y))
The return value of fit() is stored in the variable history. Checking it with type() shows it is an object, so let's inspect it with vars().
007.py
type(history) # <class 'keras.callbacks.callbacks.History'>
vars(history)
#A lot of information is output.
#Looking through it, the History object has the following fields:
# validation_data (list)
# model (reference to the NN model)
# params (dictionary; keys are 'batch_size', 'epochs', 'steps', 'samples', 'verbose', 'do_validation', 'metrics')
# epoch (list)
# history (dictionary; keys are 'val_loss' and 'loss')
#
#loss is the loss on the training data; val_loss is the loss on the evaluation data.
#Since the variable here is named history, history.history['val_loss'] gives the
#epoch-by-epoch record of how the learning progressed.
Let's plot how learning progresses.
008.py
import matplotlib.pyplot as plt
plt.plot(history.history['val_loss'], label = "val_loss")
plt.plot(history.history['loss'], label = "loss")
plt.legend() #Show legend
plt.title("Can NN learn to calculate average and standard deviation?")
plt.xlabel("epoch")
plt.ylabel(" Loss")
plt.show()
The graph this produces:
You can see that learning progressed, but how accurately can the NN actually "calculate"? Feed the first 200 evaluation sets into the NN and plot its output (vertical axis) against the mathematically computed values (horizontal axis).
009.py
#Give data to the trained NN
inp = d_test_x[:200,:]
out = d_test_y[:200,:]
pred = model.predict(inp, batch_size=1)
#Make a graph:average
plt.scatter(out[:,0], pred[:,0])
plt.title("average")
plt.xlabel("mathematical calculation")
plt.ylabel("NN output")
#Draw the line y = x. Points that fall on this line are predicted well.
x = np.arange(-0.5, 2, 0.01)
y = x
plt.plot(x, y)
plt.show()
You can see that the "calculation" is done with reasonably high accuracy. What about the standard deviation, then?
009.py
#Make a graph:standard deviation
plt.scatter(out[:,1], pred[:,1])
plt.title("standard deviation")
plt.xlabel("mathematical calculation")
plt.ylabel("NN output")
x = np.arange(0, 1.5, 0.01)
y = x
plt.plot(x, y)
plt.show()
Not bad, I suppose? The mean comes out well, but the standard deviation is clearly less accurate.
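To put a number on that impression, one can compare the mean absolute error of the two columns over the 200 evaluation sets (a small check I added; the exact values depend on the training run):

mae_mean = np.mean(np.abs(pred[:, 0] - out[:, 0]))  # error of the predicted means
mae_std  = np.mean(np.abs(pred[:, 1] - out[:, 1]))  # error of the predicted stds
print("MAE for the mean:", mae_mean)
print("MAE for the std: ", mae_std)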
Roughly speaking, the calculation a neural network performs is: multiply each input value *x* by a weight parameter *w*, take the sum of these products, and feed that sum into an activation function to obtain the output value.
For the mean, multiplying each input by 0.1 (1/10, since there are 10 inputs) and summing gives the mean exactly, so it is easy to imagine an NN learning to compute the mean of 10 values with high accuracy.
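In fact, a single linear neuron with every weight fixed at 0.1 computes the mean exactly. A minimal sketch of that idea (my own illustration, separate from the trained model above):

from keras.models import Sequential
from keras.layers import Dense

exact_mean = Sequential()
exact_mean.add(Dense(1, activation='linear', input_shape=(10,), use_bias=False))
exact_mean.set_weights([np.full((10, 1), 0.1)])  # all ten weights = 1/10, no bias

print(exact_mean.predict(d_test_x[:1]))  # equals np.mean(d_test_x[0])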
What about the standard deviation, though? Compute the mean, then take each input's difference from the mean (that is, add the mean multiplied by -1), square it, sum the squares, divide by 10 (np.std's default is the population standard deviation, dividing by N rather than N - 1), and finally take the square root. The tricky part of this process is the squaring.
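Spelled out in NumPy, matching np.std's default:

x = d[0]                            # one set of 10 numbers
m = np.sum(x) / 10                  # the mean: a plain weighted sum
diff = x - m                        # difference from the mean
s = np.sqrt(np.sum(diff**2) / 10)   # square, sum, divide by N=10, square root
print(s, np.std(x))                 # the two values agree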
Internally, an NN multiplies inputs by fixed weight parameters, sums them, and passes the results through activation functions. Can that really reproduce the square of an arbitrary input with almost no error? In principle, enough parameters can approximate any curve, but it is not obvious what the learned computation looks like.
Perhaps an activation function that squares its input (or, more generally, raises it to the n-th power) would help. I would like to explore that at some point.
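Keras does accept an arbitrary callable as an activation, so a squaring activation can at least be written down; whether it actually helps this task is untested here. A hypothetical sketch:

from keras import backend as K
from keras.models import Sequential
from keras.layers import Dense

def square_activation(x):
    return K.square(x)  # element-wise x**2

# Hypothetical variant: a squaring layer between the usual Dense layers.
model2 = Sequential()
model2.add(Dense(100, activation='tanh', input_shape=(10,)))
model2.add(Dense(40, activation=square_activation))
model2.add(Dense(2, activation='linear'))
model2.compile(loss='mean_squared_error', optimizer='adam')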
Now that we have an NN that outputs values close to the mean and standard deviation, that concludes Part 2. Series: Part 1, Preparation; Part 2, Mean and Standard Deviation; Part 3, Normal Distribution; Part 4, Circle.