[PYTHON] Automating parameter optimization in Keras with GridSearchCV

I tried GridSearchCV with Keras

The accuracy of a machine learning model depends on its parameters. Many parameters are set when building a model, such as the activation function, the optimization algorithm, and the number of units in the hidden layers, but you cannot know whether the values you chose are optimal until you train the model and put it to use.

Still, the appeal of machine learning is that it produces an optimal model automatically. If so, I thought, the parameters themselves could surely be optimized automatically too!

scikit-learn, the well-known Python machine learning library, has a class called GridSearchCV that handles model selection and parameter tuning.

http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html

In fact, Keras provides a wrapper for the scikit-learn API, so GridSearchCV can be used when building Keras models.

https://keras.io/ja/scikit-learn-api/

So, let's try GridSearchCV with Keras right away.

Goal

Create an optimal model using GridSearchCV with Keras. The task is to classify the ever-popular Iris data set.

https://en.wikipedia.org/wiki/Iris_flower_data_set

Coding

Let's write it right away.

First, import what you need. The Iris data is the copy bundled with sklearn.

import numpy as np
from sklearn import datasets, preprocessing
from sklearn.model_selection import train_test_split
from sklearn.model_selection import GridSearchCV
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.utils import np_utils
from keras import backend as K
from keras.wrappers.scikit_learn import KerasClassifier

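Split the Iris data 7:3 into training and test sets.

iris = datasets.load_iris()
# Standardize each feature to zero mean and unit variance
x = preprocessing.scale(iris.data)
# Convert the integer class labels into one-hot vectors
y = np_utils.to_categorical(iris.target)
# Keep 70% of the samples for training and the rest for testing
x_tr, x_te, y_tr, y_te = train_test_split(x, y, train_size=0.7)
num_classes = y_te.shape[1]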
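
To make the two preprocessing steps concrete, here is a minimal NumPy sketch of what standardization and one-hot encoding do. The toy arrays and the names x_toy and labels_toy are made up for illustration, and the sketch assumes the scaling uses the population standard deviation; it is not the library implementation.

```python
import numpy as np

# Hypothetical toy data: 4 samples, 2 features, labels drawn from 3 classes.
x_toy = np.array([[1.0, 10.0],
                  [2.0, 20.0],
                  [3.0, 30.0],
                  [4.0, 40.0]])
labels_toy = np.array([0, 2, 1, 2])

# Standardize each column: subtract the column mean and divide by the
# column standard deviation (population std, ddof=0, as an assumption).
x_scaled = (x_toy - x_toy.mean(axis=0)) / x_toy.std(axis=0)

# One-hot encode: row i of the identity matrix is the vector for class i.
num_classes_toy = int(labels_toy.max()) + 1
y_onehot = np.eye(num_classes_toy)[labels_toy]

print(x_scaled.mean(axis=0))  # close to 0 for every column
print(x_scaled.std(axis=0))   # close to 1 for every column
print(y_onehot)
```

After this step every feature is on the same scale, and each label is a vector whose length equals the number of classes, which is what the softmax output layer and categorical_crossentropy expect.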

Define the neural network model as a function. The number of layers is fixed here, and the values to be tuned are taken as arguments.

def iris_model(activation="relu", optimizer="adam", out_dim=100):
    model = Sequential()
    model.add(Dense(out_dim, input_dim=4, activation=activation))
    model.add(Dense(out_dim, activation=activation))
    model.add(Dense(num_classes, activation="softmax"))
    model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model

Define the choices for each parameter. GridSearchCV evaluates every combination of the values defined here.

activation = ["relu", "sigmoid"]
optimizer = ["adam", "adagrad"]
out_dim = [100, 200]
nb_epoch = [10, 25]
batch_size = [5, 10]
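
To get a feel for how much work the search has to do, the combinations can be enumerated with the standard library. This is only a sketch of the idea of a grid, not how GridSearchCV is implemented, and the fold count n_folds is an assumption (the default depends on the installed scikit-learn version).

```python
from itertools import product

# Same candidate lists as in the article.
param_grid = {
    "activation": ["relu", "sigmoid"],
    "optimizer": ["adam", "adagrad"],
    "out_dim": [100, 200],
    "nb_epoch": [10, 25],
    "batch_size": [5, 10],
}

# Cartesian product of all candidate lists: one dict per combination.
names = list(param_grid)
combinations = [dict(zip(names, values))
                for values in product(*(param_grid[n] for n in names))]

# Assumption: 3 cross-validation folds.
n_folds = 3
print(len(combinations))            # number of parameter combinations
print(len(combinations) * n_folds)  # training runs during the search
print(combinations[0])              # first combination in the grid
```

Every extra parameter with two choices doubles the number of training runs, which is why even a small data set like Iris takes a while.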

Pass the model function and the parameters to GridSearchCV. The model is wrapped with KerasClassifier and the parameter choices are collected in a dict; GridSearchCV then combines the two.

model = KerasClassifier(build_fn=iris_model, verbose=0)
param_grid = dict(activation=activation, 
                  optimizer=optimizer, 
                  out_dim=out_dim, 
                  nb_epoch=nb_epoch, 
                  batch_size=batch_size)
grid = GridSearchCV(estimator=model, param_grid=param_grid)

Start training!

grid_result = grid.fit(x_tr, y_tr)

... 30 minutes of waiting ... Even though this is only a classification of the Iris data, it takes time on a CPU. ... Would it be faster with GPGPU? ... I want a GPU ... Done!

Print the result: the best score and the parameters that achieved it.

print(grid_result.best_score_)
print(grid_result.best_params_)

[Image: 2017-02-08_4.PNG]

95% ... OK.
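
As far as I understand it, best_score_ is the mean cross-validated score of the best parameter combination, not a score on the held-out test data. The selection step can be sketched as follows; the per-fold accuracies and the two-parameter grid are made up for illustration and are not the results of this run.

```python
from statistics import mean

# Made-up per-fold accuracies for three hypothetical parameter combinations,
# keyed by (activation, optimizer).
fold_scores = {
    ("relu", "adam"): [0.94, 0.97, 0.91],
    ("relu", "adagrad"): [0.89, 0.91, 0.90],
    ("sigmoid", "adam"): [0.97, 0.94, 0.94],
}

# Average the fold scores of each combination ...
mean_scores = {params: mean(scores) for params, scores in fold_scores.items()}

# ... and keep the combination with the highest average.
best_params = max(mean_scores, key=mean_scores.get)
best_score = mean_scores[best_params]

print(best_params)
print(round(best_score, 4))
```

Because the score is an average over validation folds taken from the training data, a separate check on the test data is still worthwhile.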

Now let's validate the model with the test data we set aside at the start. Incidentally, once Keras is wrapped in GridSearchCV, model.evaluate no longer seems to be available. So the predictions for the test data are compared against the correct labels by hand.

grid_eval = grid.predict(x_te)
# grid.predict returns class indices, so turn the one-hot test labels back into indices
y_true = np.argmax(y_te, axis=1)
# Count a sample as correct only when its predicted class matches its true class;
# comparing one-hot vectors element by element would also credit partial matches
accuracy = (grid_eval == y_true)
print(np.mean(accuracy))
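
It is worth being clear about what gets counted when one-hot labels are compared. A minimal sketch with made-up labels (not the Iris results) contrasts an element-wise count with a per-sample count:

```python
import numpy as np

# Made-up example: 4 test samples, 3 classes, one wrong prediction.
y_true_idx = np.array([0, 1, 2, 1])
y_pred_idx = np.array([0, 1, 2, 2])  # the last sample is misclassified

y_true_onehot = np.eye(3)[y_true_idx]
y_pred_onehot = np.eye(3)[y_pred_idx]

# Element-wise: a wrong row still agrees in one of its three positions.
elementwise = np.mean(y_pred_onehot == y_true_onehot)

# Per sample: a row counts only if the predicted class equals the true class.
per_sample = np.mean(y_pred_idx == y_true_idx)

print(elementwise)
print(per_sample)
```

The element-wise figure is always at least as high as the per-sample one, so the per-sample figure is the one that corresponds to classification accuracy.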

[Image: 2017-02-08_5.PNG]

98%! It feels pretty good.

The model looks like this.

model = iris_model(activation=grid_result.best_params_['activation'], 
                   optimizer=grid_result.best_params_['optimizer'], 
                   out_dim=grid_result.best_params_['out_dim'])
model.summary()

[Image: 2017-02-08_6.PNG]

How was that? The Iris data contains some outliers, so even with your best effort the accuracy will not reach 100%. Training takes time because every combination of parameters is trained, but it is still easier than searching for the parameters by hand.
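
The parameter counts that model.summary() reports for this kind of network can be checked by hand: a fully connected layer has one weight per input-output pair plus one bias per output. The helper names below are hypothetical and only mirror the three Dense layers of iris_model.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


def iris_model_params(out_dim, n_features=4, n_classes=3):
    """Total trainable parameters of the three Dense layers in iris_model."""
    return (dense_params(n_features, out_dim)    # input -> first hidden layer
            + dense_params(out_dim, out_dim)     # first -> second hidden layer
            + dense_params(out_dim, n_classes))  # second hidden -> output


for out_dim in (100, 200):
    print(out_dim, iris_model_params(out_dim))
```

Doubling out_dim roughly quadruples the total, because the layer between the two hidden layers grows with the square of out_dim.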

Below is the full code.

import numpy as np
from sklearn import datasets, preprocessing
from sklearn.model_selection import train_test_split
from sklearn.model_selection import GridSearchCV
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.utils import np_utils
from keras import backend as K
from keras.wrappers.scikit_learn import KerasClassifier


# Import the data and split it into training and test sets
iris = datasets.load_iris()
x = preprocessing.scale(iris.data)
y = np_utils.to_categorical(iris.target)
x_tr, x_te, y_tr, y_te = train_test_split(x, y, train_size=0.7)
num_classes = y_te.shape[1]


# Define model for iris classification
def iris_model(activation="relu", optimizer="adam", out_dim=100):
    model = Sequential()
    model.add(Dense(out_dim, input_dim=4, activation=activation))
    model.add(Dense(out_dim, activation=activation))   
    model.add(Dense(num_classes, activation="softmax"))
    model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model

# Define options for parameters
activation = ["relu", "sigmoid"]
optimizer = ["adam", "adagrad"]
out_dim = [100, 200]
nb_epoch = [10, 25]
batch_size = [5, 10]


# Pass the model and the parameters to GridSearchCV
model = KerasClassifier(build_fn=iris_model, verbose=0)
param_grid = dict(activation=activation, 
                  optimizer=optimizer, 
                  out_dim=out_dim, 
                  nb_epoch=nb_epoch, 
                  batch_size=batch_size)
grid = GridSearchCV(estimator=model, param_grid=param_grid)


# Run grid search
grid_result = grid.fit(x_tr, y_tr)


# Get the best score and the best parameters
print(grid_result.best_score_)
print(grid_result.best_params_)

# Evaluate the model with test data
grid_eval = grid.predict(x_te)
# grid.predict returns class indices, so turn the one-hot test labels back into indices
y_true = np.argmax(y_te, axis=1)
# Count a sample as correct only when its predicted class matches its true class;
# comparing one-hot vectors element by element would also credit partial matches
accuracy = (grid_eval == y_true)
print(np.mean(accuracy))


# Now see the optimized model
model = iris_model(activation=grid_result.best_params_['activation'], 
                   optimizer=grid_result.best_params_['optimizer'], 
                   out_dim=grid_result.best_params_['out_dim'])
model.summary()
