Start studying: Saturday, December 7th
Teaching materials:
・Miyuki Oshige, "Details! Python 3 Introductory Note" (Sotec, 2017): read 12/7 (Sat) - 12/19 (Thu)
・Progate Python course (5 courses in total): 12/19 (Thu) - 12/21 (Sat)
・Andreas C. Müller, Sarah Guido, "Introduction to Machine Learning with Python" (O'Reilly Japan, 2017): 12/21 (Sat) - 12/23 (Sat)
・Kaggle: Real or Not? NLP with Disaster Tweets: submitted; 12/28 (Sat) - 1/3 (Fri) used for adjustment
・Wes McKinney, "Python for Data Analysis" (O'Reilly Japan, 2018): read 1/4 (Wed) - 1/13 (Mon)
・Yasuki Saito, "Deep Learning from Scratch" (O'Reilly Japan, 2016): 1/15 (Wed) - 1/20 (Mon)
・**François Chollet, "Deep Learning with Python and Keras" (Queep, 2018): 1/21 (Tue) -**
Finished reading up to p. 186, Chapter 5 (deep learning for computer vision).
The book explains neural networks using the handwritten-digit dataset bundled with Keras, but I instead tried classification on sklearn's iris dataset.
Importing the modules
from keras import models, layers
from keras.utils.np_utils import to_categorical
from sklearn import datasets
from sklearn.model_selection import train_test_split
Preparation
#Load the iris toy dataset
iris = datasets.load_iris()
#Check the attributes with dir(), then assign 'data' and 'target' to variables
dir(iris)
x = iris.data
y = iris.target
#Check the data shapes
print([x.shape, y.shape])
print(iris.feature_names)
print(x[0])
x.shape = (150, 4), y.shape = (150, )
iris.feature_names shows what each column of x holds: the four columns correspond to ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)'], respectively.
y takes the values 0 to 2, corresponding to the three iris species.
Normalization
def normalization(x):
    mean = x.mean(axis = 0)
    std = x.std(axis = 0)
    x -= mean
    x /= std
    return x
x = normalization(x)
y = to_categorical(y)
Subtracting the mean from x and dividing by the standard deviation normalizes each feature to mean 0 and standard deviation 1. to_categorical converts y into one-hot vectors of 0s and 1s.
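A minimal numpy sketch of what these two preprocessing steps do, using made-up toy values rather than the iris data:

```python
import numpy as np

# Toy values (not the iris data): 3 samples, 2 features.
x = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
y = np.array([0, 2, 1])

# Standardize each column to mean 0, standard deviation 1.
x_norm = (x - x.mean(axis=0)) / x.std(axis=0)
print(x_norm.mean(axis=0))  # ≈ [0. 0.]
print(x_norm.std(axis=0))   # [1. 1.]

# One-hot encoding, equivalent to to_categorical(y) with 3 classes:
# row i gets a 1 in column y[i] and 0 elsewhere.
one_hot = np.eye(3)[y]
print(one_hot)
```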
Splitting into training and test sets
x_train, x_test, y_train, y_test = \
    train_test_split(x, y, train_size = 0.7, random_state = 3)
print([x_train.shape, x_test.shape, y_train.shape, y_test.shape])
The data is split 7:3 into training and test sets. Since shuffle defaults to True, only random_state is set.
The shapes are [(105, 4), (45, 4), (105, 3), (45, 3)], respectively, confirming the split worked as intended.
Model construction: two hidden layers
model = models.Sequential()
model.add(layers.Dense(64, activation = 'relu', input_shape = (x.shape[1], )))
model.add(layers.Dense(64, activation = 'relu'))
model.add(layers.Dense(3, activation = 'softmax'))
model.summary()
model.compile(optimizer = 'rmsprop', loss = 'categorical_crossentropy', metrics = ['accuracy'])
ReLU is used as the activation function for the hidden layers, and softmax for the last layer (see Table 4-1, p. 119).
model.summary() displays each layer.
For the optimizer, I chose rmsprop based on the book's comment that it is good enough in most cases.
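To make the activation choices concrete, here is a small numpy sketch of ReLU and softmax (my own illustration, not code from the book):

```python
import numpy as np

def relu(z):
    # Zeroes out negative inputs, passes positives through unchanged.
    return np.maximum(0.0, z)

def softmax(z):
    # Subtracting the max before exponentiating is a standard
    # numerical-stability trick; it does not change the result.
    e = np.exp(z - z.max())
    return e / e.sum()

print(relu(np.array([-1.0, 0.5, 2.0])))  # [0.  0.5 2. ]

# Softmax turns the 3 raw outputs into class probabilities summing to 1,
# which is why it suits the 3-class output layer.
logits = np.array([2.0, 1.0, 0.1])
print(softmax(logits))
print(softmax(logits).sum())  # 1.0
```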
Training and evaluation
model.fit(x_train, y_train, epochs = 100, batch_size = 1, verbose=0)
result = model.evaluate(x_test, y_test)
print(result)
#[0.15816927485995821, 0.9555555582046509]
epochs is set to 100 for the time being, and batch_size to 1.
Despite the simple architecture, the model reached about 96% test accuracy.
Since there are three iris species, an error occurred when the last layer was **layers.Dense(1)**; it has to be **layers.Dense(3)**. The size of the output layer must match the number of classes (**10 for the handwritten digits 0-9**).
This is not an error, but if the per-epoch training log is not needed, it can be suppressed by passing the argument verbose = 0.
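To illustrate why the last layer needs one unit per class: with 3 softmax outputs, the predicted species is simply the index of the largest probability. The values below are hypothetical, not actual model output:

```python
import numpy as np

# Hypothetical softmax output of the 3-unit last layer for one flower.
probs = np.array([0.05, 0.90, 0.05])
print(probs.argmax())  # 1 -> the second iris species

# A Dense(1) layer would produce a single number instead, so a 3-element
# one-hot target could not be compared against it (shape mismatch).
```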