This article is the Day 17 entry of the Machine Learning Advent Calendar 2016.
This time, I would like to introduce scikit-optimize, a library that can estimate the parameters that minimize a black-box function.
Installation
The environment tested this time is as follows.
Installation is easy with pip.
pip install scikit-optimize
Example
The Getting Started example in README.md on GitHub uses a function with noise added. It would be nice to know the form of the function, but in practice it may be unknown. Even then, as long as the function can be evaluated at a given data point x, the x that minimizes it can be estimated with a method called Bayesian optimization.
import numpy as np
from skopt import gp_minimize
def f(x):
    # Deterministic signal plus additive Gaussian noise
    return (np.sin(5 * x[0]) * (1 - np.tanh(x[0] ** 2)) + np.random.randn() * 0.1)
res = gp_minimize(f, [(-2.0, 2.0)])
The returned res has the following attributes.
fun = min(func_vals)
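To make the relationship between these values concrete, here is a minimal sketch that uses only the standard library. It assumes that res also exposes x_iters (the evaluated points) and func_vals (the observed values), alongside x and fun. The random sampling below is only a stand-in for the optimizer, not Bayesian optimization itself, and the test function is a similar noisy 1-D function rewritten with the math module.

```python
import math
import random

def f(x):
    # Noisy 1-D test function: sine-based signal plus Gaussian noise.
    return math.sin(5 * x[0]) * (1 - math.tanh(x[0] ** 2)) + random.gauss(0, 1) * 0.1

random.seed(0)

# Stand-in for the optimizer: evaluate f at random points in [-2, 2].
x_iters = [[random.uniform(-2.0, 2.0)] for _ in range(30)]
func_vals = [f(x) for x in x_iters]

# fun is the smallest observed value; x is the point where it was observed.
best_index = func_vals.index(min(func_vals))
fun = func_vals[best_index]
x = x_iters[best_index]

print(x, fun)
```

Because the function is noisy, fun is the best observed value rather than the true minimum of the underlying function.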
For Machine Learning
Machine learning (especially supervised learning) aims to build a model from a dataset and improve its predictive performance on unseen data. To that end, models are evaluated with cross-validation and various evaluation metrics. Furthermore, tuning hyperparameters is indispensable if you want a higher-performing model. This time, I will try tuning hyperparameters using skopt.
First, decide on the data and the machine learning model. This page also has an example, but since I'm at it, I'll try a slightly different model.
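Hyperparameter tuning can be framed as the same kind of black-box minimization: the input is a list of hyperparameters and the output is a score. Since most scores are "higher is better", the objective returns the negated score. The following is a minimal sketch of that idea only. validation_score is a hypothetical stand-in for a cross-validated metric, and the exhaustive search over five candidates stands in for the optimizer.

```python
def validation_score(max_depth):
    # Hypothetical stand-in for a cross-validated score (higher is better);
    # it peaks at max_depth = 3 by construction.
    return 0.95 - 0.01 * (max_depth - 3) ** 2

def objective(params):
    # A minimizer looks for the smallest value, so return the negated score.
    (max_depth,) = params
    return -validation_score(max_depth)

# Exhaustive search over a tiny space, standing in for the optimizer.
candidates = [[d] for d in range(1, 6)]
best_params = min(candidates, key=objective)

# Negate again to recover the score in its original "higher is better" form.
best_score = -objective(best_params)

print(best_params, best_score)
```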
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score
data = load_breast_cancer()
X, y = data.data, data.target
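n_features = X.shape[1]  # number of feature columns (len(X) would be the number of samples)
model = GradientBoostingClassifier()  # set_params below needs an instance, not the class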
def objective(params):
    max_depth, lr, max_features, min_samples_split, min_samples_leaf = params
    model.set_params(max_depth=max_depth,
                     max_features=max_features,
                     learning_rate=lr,
                     min_samples_split=min_samples_split,
                     min_samples_leaf=min_samples_leaf)
    # gp_minimize can only minimize, so for a metric where higher is better, return its negative.
    return -np.mean(cross_val_score(model, X, y, cv=5, scoring='roc_auc'))
space = [(1, 5), (10**-5, 10**-1, "log-uniform"), (1, n_features), (2, 30), (1, 30)]
x0 = [3, 0.01, 6, 2, 1]
res = gp_minimize(objective, space, x0=x0, n_calls=50)
print(res.fun) # -0.993707074488
print(res.x) # [5, 0.096319962593215167, 1, 30, 22]
In this way, we were able to find good hyperparameters within 50 evaluations. Incidentally, with this dataset, gp_minimize took 17 seconds.
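The learning rate dimension in the search space above is marked "log-uniform". As a rough illustration of why, here is a standard-library sketch that assumes "log-uniform" means sampling uniformly in log10 of the value. sample_log_uniform is a hypothetical helper written for this illustration, not part of skopt.

```python
import math
import random

def sample_log_uniform(low, high, rng):
    # Draw uniformly in log10 space, then map back to the original scale.
    exponent = rng.uniform(math.log10(low), math.log10(high))
    return 10 ** exponent

rng = random.Random(0)
samples = [sample_log_uniform(1e-5, 1e-1, rng) for _ in range(1000)]

# Count samples per decade: [1e-5, 1e-4), [1e-4, 1e-3), [1e-3, 1e-2), [1e-2, 1e-1].
decades = [0, 0, 0, 0]
for s in samples:
    index = int(math.floor(math.log10(s))) + 5
    index = max(0, min(index, 3))  # guard against rounding at the boundaries
    decades[index] += 1

print(decades)
```

Each decade receives roughly a quarter of the samples. With a plain uniform prior over the same range, about 90% of the samples would land between 0.01 and 0.1, and small learning rates would hardly be explored.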
Others
The official website has some samples other than those described above.