Bayesian optimization is a method for optimizing a function using a machine learning model called a Gaussian process, which performs kernel regression in a Bayesian manner. It has several advantages: it can be applied to functions whose inputs are not continuous (or that are non-differentiable), and it is robust against functions with many local optima. Because of these merits, it is expected to be applicable to a wide range of problems. Here, we explain how to use GPyOpt, a Python package for Bayesian optimization.
We use Anaconda as the environment. You need up-to-date scipy and GPy (Gaussian process) packages.
$ conda update scipy
$ pip install GPy
$ pip install gpyopt
First, let's optimize a nonlinear function with a one-dimensional input. This time, we will minimize the function $f(x) = \cos(1.5x) + 0.1x$ over the range $0 \leq x \leq 10$.
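Before involving any library, it helps to know roughly where the true minimum is. The brute-force grid search below is my own sanity check (not part of the original post) and suggests the minimum lies near $x \approx 2.05$:

```python
import numpy as np

# The target function from the text
def f(x):
    return np.cos(1.5*x) + 0.1*x

# Dense grid search over [0, 10] as a brute-force reference
x_grid = np.linspace(0, 10, 100001)
y_grid = f(x_grid)
i = np.argmin(y_grid)
print(x_grid[i], y_grid[i])  # minimum near x = 2.05, f = -0.793
```

This reference value is useful later for checking whether Bayesian optimization converged to the right place.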
Bayesian optimization first randomly samples a few inputs within the given range. Next, it passes those input samples through the function to obtain the corresponding outputs. Then, using those samples, it performs Gaussian process regression. In this figure, green is the function to be optimized, and blue is the function regressed by the Gaussian process.
Now, let's find where the function is minimized. The idea is to pick the x that is most likely to give the smallest y, evaluate it, and repeat. There are various criteria for which x is most promising, but this time we choose the x at which the lower edge of the blue band in the figure is smallest. This blue band is the region that is likely to contain the green line, given only the red data points.
Now, let's draw one sample by this method. After sampling, the important point is that the blue band changes. By repeating sampling → updating the blue band → sampling → updating the blue band → ..., we can find the x that minimizes $f(x)$.
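This "pick where the blue band is lowest" rule is the LCB (lower confidence bound) acquisition function. A minimal numpy sketch of the selection step (my own illustration with made-up posterior values, not GPyOpt code), assuming we already have the posterior mean mu and standard deviation sigma on a grid of candidate points:

```python
import numpy as np

# Hypothetical posterior over 5 candidate points (illustrative values)
x_cand = np.array([0.0, 2.5, 5.0, 7.5, 10.0])
mu     = np.array([1.0, -0.5, 0.3, 0.8, 0.2])   # posterior mean
sigma  = np.array([0.1,  0.2, 0.9, 0.1, 0.3])   # posterior std (width of the blue band)

kappa = 2.0                      # exploration weight
lcb = mu - kappa * sigma         # lower edge of the "blue band"
x_next = x_cand[np.argmin(lcb)]  # sample next where the band is lowest
print(x_next)  # 5.0
```

Note that the chosen point is not the one with the lowest mean: the large uncertainty at x = 5.0 pushes its band lower, which is how LCB balances exploration against exploitation.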
The figure after several rounds of sampling is shown below. Many samples are concentrated around $x = 2$, and the optimum solution is the sample at which $f(x)$ is actually smallest.
Let's do it with GPyOpt.
First, import.
import GPy
import GPyOpt
import numpy as np
Then, define the function you want to optimize. Here, you can write it using numpy functions.
def f(x):
    '''
    Non-linear function to be optimized this time
    '''
    return np.cos(1.5*x) + 0.1*x
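One point worth knowing: GPyOpt calls f with a 2-D array of shape (number of points, number of inputs), so f should accept and return arrays. A quick check of the function above (my own sanity check):

```python
import numpy as np

def f(x):
    return np.cos(1.5*x) + 0.1*x

# GPyOpt passes inputs as a 2-D array of shape (n, 1)
x = np.array([[0.0], [2.05]])
y = f(x)
print(y.shape)  # (2, 1)
print(y[0, 0])  # cos(0) + 0 = 1.0
```

Because the function uses elementwise numpy operations, it handles this batched input automatically.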
Now, let's first define the domain of x. 'type': 'continuous' indicates that the variable is continuous, and 'domain': (0,10) indicates $0 \leq x \leq 10$. The second line creates the object for Bayesian optimization: f is the function you want to optimize, domain is the domain you defined earlier, and initial_design_numdata is the number of initial samples. Finally, acquisition_type specifies how the next sample point x is chosen ('LCB' selects the point where the lower edge of the blue band is smallest).
bounds = [{'name': 'x', 'type': 'continuous', 'domain': (0,10)}]
myBopt = GPyOpt.methods.BayesianOptimization(f=f, domain=bounds, initial_design_numdata=5, acquisition_type='LCB')
Then, use the following command to repeatedly sample and optimize. max_iter indicates the number of samplings.
myBopt.run_optimization(max_iter=15)
The optimum solution is displayed.
print(myBopt.x_opt) #[ 2.05769988]
print(myBopt.fx_opt) #[-0.79271554]
Other attributes you might use:
myBopt.model.model  # Gaussian process model used in Bayesian optimization (GPy object)
myBopt.model.model.predict  # Gaussian process regression (prediction) function
myBopt.X, myBopt.Y  # sampled x and y
Bayesian optimization can also be applied in multiple dimensions. Here, we perform two-dimensional Bayesian optimization. Since input variables may also be discrete, we make one of the dimensions a discrete variable.
The function is $f(x) = \log(10.5 - x_0) + 0.1\sin(15x_0)$ when $x_1 = 0$, and $f(x) = \cos(1.5x_0) + 0.1x_0$ when $x_1 = 1$. We minimize it over the range $0 \leq x_0 \leq 10$ ($x_1$ is 0 or 1).
After optimization, it looks like the following image.
Let's do it with GPyOpt.
import GPy
import GPyOpt
import numpy as np
def f(x):
    '''
    Non-linear function to be optimized this time
    '''
    x0, x1 = x[:,0], x[:,1]
    f0 = np.log(10.5 - x0) + 0.1*np.sin(15*x0)
    f1 = np.cos(1.5*x0) + 0.1*x0
    return (1 - x1)*f0 + x1*f1
bounds = [{'name': 'x0', 'type': 'continuous', 'domain': (0,10)},
{'name': 'x1', 'type': 'discrete', 'domain': (0,1)}]
myBopt = GPyOpt.methods.BayesianOptimization(f=f, domain=bounds)
myBopt.run_optimization(max_iter=30)
It's not so different from the one-dimensional case. What has changed is that the function now takes two inputs, and bounds is a list with two elements. 'type': 'discrete' represents a choice from the listed values, here 0 or 1.
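As in the 1-D case, f receives a 2-D array, here of shape (n, 2), with one column per input variable. A quick check of the branch behavior (my own sanity check, not part of the original post):

```python
import numpy as np

def f(x):
    x0, x1 = x[:, 0], x[:, 1]
    f0 = np.log(10.5 - x0) + 0.1*np.sin(15*x0)
    f1 = np.cos(1.5*x0) + 0.1*x0
    return (1 - x1)*f0 + x1*f1

# x1 = 0 selects the log branch, x1 = 1 selects the cosine branch
x = np.array([[0.0, 0.0],
              [0.0, 1.0]])
y = f(x)
print(y)  # [log(10.5), cos(0)] = approx. [2.351, 1.0]
```

Writing the two branches as a single weighted sum keeps the function vectorized, which is what lets GPyOpt evaluate batches of points in one call.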
The optimum solution is displayed.
print(myBopt.x_opt) #[ 2.0539031 1. ]
print(myBopt.fx_opt) #[-0.7927657]
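To double-check this result, a brute-force grid search over both discrete branches (my own verification, not part of the original post) confirms the global optimum sits on the $x_1 = 1$ branch near $x_0 \approx 2.05$:

```python
import numpy as np

def f(x):
    x0, x1 = x[:, 0], x[:, 1]
    f0 = np.log(10.5 - x0) + 0.1*np.sin(15*x0)
    f1 = np.cos(1.5*x0) + 0.1*x0
    return (1 - x1)*f0 + x1*f1

# Evaluate both discrete branches on a dense x0 grid and keep the best point
x0 = np.linspace(0, 10, 100001)
best = None
for x1 in (0.0, 1.0):
    x = np.column_stack([x0, np.full_like(x0, x1)])
    y = f(x)
    i = np.argmin(y)
    if best is None or y[i] < best[2]:
        best = (x0[i], x1, y[i])
print(best)  # x0 = approx. 2.05, x1 = 1.0, f = approx. -0.793
```

The grid-search optimum agrees with what GPyOpt found above, while using only about 30 function evaluations instead of 200,002.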