[PYTHON] Introduction to scikit-optimize

This article is day 17 of the Machine Learning Advent Calendar 2016.

This time, I would like to introduce scikit-optimize, a library that searches for the parameters that minimize a black-box function.

scikit-optimize official page (https://scikit-optimize.github.io/)
GitHub page (https://github.com/scikit-optimize/scikit-optimize)

Installation

The environment used for testing is as follows.

OS: macOS Sierra
Python 3.5.2

Installation is easy with pip.

pip install scikit-optimize

Example

The Getting Started example in README.md on GitHub uses a function with noise added. It would be convenient if the form of the function were known, but in practice it is often unknown. Even in that case, as long as the function can be evaluated at a given point x, the x that minimizes it can be estimated with a method called Bayesian optimization.

import numpy as np
from skopt import gp_minimize

def f(x):
    # Signal plus additive Gaussian noise (noise level 0.1)
    return (np.sin(5 * x[0]) * (1 - np.tanh(x[0] ** 2)) + np.random.randn() * 0.1)

res = gp_minimize(f, [(-2.0, 2.0)])

The returned res has the following attributes.

fun: Minimum value of f(x) found, i.e. fun = min(func_vals)
func_vals: The value of f(x) obtained in each trial
models: Surrogate models used in each trial
random_state: Random state (seed)
space: Search space (Space)
specs: Call specifications (the arguments the optimizer was called with)
x: The value of x at which the minimum was found
x_iters: The value of x evaluated in each trial
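
To make the relationship between these fields concrete, here is a minimal sketch that does not use scikit-optimize at all. It runs a plain random search over the same interval using only the standard library. The function random_search and the dictionary it returns are hypothetical; the keys merely mirror the attribute names listed above. Random search is a much simpler strategy than the Gaussian-process-based search in gp_minimize, and it is shown here only to illustrate what a black-box minimizer records.

```python
import math
import random


def f(x):
    # Noisy 1-D test function: signal plus Gaussian noise (noise level 0.1).
    return math.sin(5 * x[0]) * (1 - math.tanh(x[0] ** 2)) + random.gauss(0, 0.1)


def random_search(func, bounds, n_calls=100, seed=0):
    """Hypothetical helper: evaluate func at uniformly random points."""
    rng = random.Random(seed)
    x_iters, func_vals = [], []
    for _ in range(n_calls):
        # The optimizer only ever calls func; it never looks inside it.
        x = [rng.uniform(low, high) for low, high in bounds]
        x_iters.append(x)
        func_vals.append(func(x))
    best = min(range(n_calls), key=func_vals.__getitem__)
    return {
        "x": x_iters[best],         # point with the smallest observed value
        "fun": func_vals[best],     # smallest observed value
        "x_iters": x_iters,         # every point that was evaluated
        "func_vals": func_vals,     # every observed value
    }


random.seed(0)  # fixes the noise drawn inside f
result = random_search(f, [(-2.0, 2.0)])
print(result["x"], result["fun"])
```

Because f is noisy, the smallest observed value is not necessarily the true minimum of the underlying function; this is the same caveat that applies to fun and x above.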

For Machine Learning

Machine learning (especially supervised learning) aims to build a model from a dataset and to improve its predictive performance on unseen data. Models are evaluated with cross-validation and various metrics, and hyperparameter tuning is indispensable if you want to build a higher-performing model. This time, I will tune hyperparameters using skopt.

Preparation

Choose the dataset and the machine learning model. The official site also has an example, but since this is a good opportunity, I'll try a slightly different model.

Data: breast_cancer
Model: GradientBoostingClassifier (scikit-learn)
Metric: AUC

Procedure

Prepare the data and model.

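import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

data = load_breast_cancer()
X, y = data.data, data.target
# Number of features (columns), not number of samples
n_features = X.shape[1]
# Instantiate the model so that set_params can be called on it
model = GradientBoostingClassifier()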

Define a black box function.

def objective(params):
    max_depth, lr, max_features, min_samples_split, min_samples_leaf = params
    
    model.set_params(max_depth=max_depth,
                     max_features=max_features,
                     learning_rate=lr,
                     min_samples_split=min_samples_split,
                     min_samples_leaf=min_samples_leaf)
    
    # gp_minimize can only minimize, so for a metric where higher is better, return its negative.
    return -np.mean(cross_val_score(model, X, y, cv=5, scoring='roc_auc'))

Determine the parameter search range (Space).

space  = [(1, 5), (10**-5, 10**-1, "log-uniform"), (1, n_features), (2, 30), (1, 30)]
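
As a side note on the "log-uniform" entry for the learning rate: it means values are drawn uniformly in log10 space rather than uniformly between the two bounds, so each decade from 10**-5 to 10**-1 is explored about equally often. The following is a minimal standard-library sketch of that idea. The helper sample_log_uniform is a hypothetical name used for illustration and is not part of scikit-optimize.

```python
import math
import random


def sample_log_uniform(low, high, rng):
    """Hypothetical helper: draw one value uniformly in log10 space."""
    exponent = rng.uniform(math.log10(low), math.log10(high))
    return 10 ** exponent


rng = random.Random(0)
samples = [sample_log_uniform(1e-5, 1e-1, rng) for _ in range(10000)]

# 1e-3 is the midpoint of the range in log space, so roughly half of the
# draws should fall below it. With plain uniform sampling on [1e-5, 1e-1],
# only about 1% of the draws would.
share_below = sum(s < 1e-3 for s in samples) / len(samples)
print(min(samples), max(samples), share_below)
```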

Set the initial point of the search.

x0 = [3, 0.01, 6, 2, 1]

Use gp_minimize to estimate the hyperparameters that minimize the objective.

res = gp_minimize(objective, space, x0=x0, n_calls=50)

print(res.fun) # -0.993707074488
print(res.x)   # [5, 0.096319962593215167, 1, 30, 22]

In this way, we were able to find well-performing hyperparameters. Incidentally, with this dataset, gp_minimize took 17 seconds.
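
The list in res.x is ordered the same way as space, so the values are easier to read once they are paired with their names. Here is a small sketch of that mapping. The list param_names is my own addition (the names are taken from the arguments used in objective), and best_x is simply copied from the output printed above rather than recomputed.

```python
# Order must match the search space; names are taken from objective().
param_names = ["max_depth", "learning_rate", "max_features",
               "min_samples_split", "min_samples_leaf"]

# Values copied from the res.x output printed above.
best_x = [5, 0.096319962593215167, 1, 30, 22]

# Pair each name with the value found at the same position.
best_params = dict(zip(param_names, best_x))
print(best_params)
```

A dictionary like this could then be passed on as GradientBoostingClassifier(**best_params) to refit the final model, assuming the names match the estimator's parameters as they do here.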

Other examples

The official website has several examples besides those described above.

Bayesian optimization
Hyperparameter optimization
Store and load results
Strategy comparison
Visualizing results