CMA-ES

Continuing from the previous posts, I will keep introducing the Python evolutionary computation library DEAP. This time we will look at CMA-ES. First, a brief explanation of what CMA-ES is. CMA-ES stands for *Covariance Matrix Adaptation Evolution Strategy* and is an optimization algorithm for non-linear, non-convex functions. In short, a search population is sampled from a multivariate normal distribution, and the mean and covariance matrix of that distribution are updated based on the evaluation of the population. Assuming the search space is $n$-dimensional and $\lambda (> 1)$ individuals are generated at step $k$, candidate solutions $x_i \in \mathbf{R}^n$ $(i = 1, \ldots, \lambda)$ are drawn from the following multivariate normal distribution.
$$ x_i \sim N(m_k, \sigma_k^2 C_k) $$

In CMA-ES, $m_k$, $\sigma_k^2$, and $C_k$ are updated step by step.
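To make the sampling step concrete, here is a minimal sketch of one generation using only numpy. The values of m, sigma, and C below are made up for illustration, and the update rules themselves (which DEAP implements internally) are omitted.

import numpy as np

n = 2            # dimension of the search space
lam = 10         # number of candidates per generation (lambda)
m = np.zeros(n)  # current mean m_k (assumed value for illustration)
sigma = 0.5      # current step size sigma_k (assumed value)
C = np.eye(n)    # current covariance matrix C_k (assumed value)

# Draw lambda candidate solutions x_i ~ N(m_k, sigma_k^2 * C_k)
rng = np.random.default_rng(0)
X = rng.multivariate_normal(mean=m, cov=sigma**2 * C, size=lam)
print(X.shape)  # (10, 2)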
This time we will optimize the Rastrigin function, which is often used as a benchmark for optimization algorithms and is expressed by the following formula.
$$ f(x) = A n + \sum_{i=1}^n \left[ x_i^2 - A \cos(2 \pi x_i) \right] $$
Here, $A = 10$ and $x_i \in [-5.12, 5.12]$; the global minimum is $f(x) = 0$ at $x = 0$. Let's draw it with matplotlib:
import matplotlib.pyplot as plt
from matplotlib import cm
from mpl_toolkits.mplot3d import Axes3D  # registers the 3D projection on older matplotlib
from deap import benchmarks
import numpy as np

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')

# Evaluate the Rastrigin function on a grid over [-5.12, 5.12]^2
X = np.arange(-5.12, 5.12, 0.1)
Y = np.arange(-5.12, 5.12, 0.1)
X, Y = np.meshgrid(X, Y)
Z = np.array([[benchmarks.rastrigin((x, y))[0] for x, y in zip(xx, yy)]
              for xx, yy in zip(X, Y)])

surf = ax.plot_surface(X, Y, Z, rstride=1, cstride=1, cmap=cm.coolwarm,
                       linewidth=0, antialiased=False)
plt.show()
Running this draws the Rastrigin surface: a bumpy landscape with many local minima surrounding the global minimum at the origin.
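As a quick sanity check on the formula above, a hand-written Rastrigin can be compared against DEAP's benchmarks.rastrigin. The helper my_rastrigin is only for this illustration.

import numpy as np
from deap import benchmarks

def my_rastrigin(x, A=10.0):
    # f(x) = A*n + sum(x_i^2 - A*cos(2*pi*x_i))
    x = np.asarray(x)
    return A * len(x) + np.sum(x**2 - A * np.cos(2 * np.pi * x))

point = [1.0, -2.0]
print(my_rastrigin(point))             # hand-written value
print(benchmarks.rastrigin(point)[0])  # DEAP returns a one-element fitness tuple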
To see what kind of covariance matrix CMA-ES produces along the way, the following script draws the covariance matrix at each step.
# -*- coding: utf-8 -*-
import numpy
from deap import algorithms
from deap import base
from deap import benchmarks
from deap import cma
from deap import creator
from deap import tools
import matplotlib.pyplot as plt
N = 2      # Problem dimension
NGEN = 50  # Total number of generations (steps)
creator.create("FitnessMin", base.Fitness, weights=(-1.0,))
creator.create("Individual", list, fitness=creator.FitnessMin)
toolbox = base.Toolbox()
toolbox.register("evaluate", benchmarks.rastrigin)
def main():
    numpy.random.seed(64)

    # The CMA-ES strategy: initial mean, step size and population size
    strategy = cma.Strategy(centroid=[5.0]*N, sigma=3.0, lambda_=20*N)
    toolbox.register("generate", strategy.generate, creator.Individual)
    toolbox.register("update", strategy.update)

    halloffame = tools.HallOfFame(1)

    halloffame_array = []
    C_array = []
    centroid_array = []
    for gen in range(NGEN):
        # Generate a new generation of individuals
        population = toolbox.generate()
        # Evaluate the population
        fitnesses = toolbox.map(toolbox.evaluate, population)
        for ind, fit in zip(population, fitnesses):
            ind.fitness.values = fit
        # Update the strategy parameters for the next generation
        # from the evaluation of the population
        toolbox.update(population)

        # Update the hall of fame and record the state of this step
        halloffame.update(population)
        halloffame_array.append(halloffame[0])
        C_array.append(strategy.C)
        centroid_array.append(strategy.centroid)

    # Draw the calculation result
    import matplotlib.cm as cm
    from matplotlib.patches import Ellipse

    plt.ion()
    fig = plt.figure()
    ax = fig.add_subplot(111)
    X = numpy.arange(-5.12, 5.12, 0.1)
    Y = numpy.arange(-5.12, 5.12, 0.1)
    X, Y = numpy.meshgrid(X, Y)
    Z = [[benchmarks.rastrigin((x, y))[0] for x, y in zip(xx, yy)]
         for xx, yy in zip(X, Y)]
    ax.imshow(Z, cmap=cm.jet, extent=[-5.12, 5.12, -5.12, 5.12])

    for x, C, xmean in zip(halloffame_array, C_array, centroid_array):
        # Draw the covariance matrix of each step as an ellipse:
        # the eigenvectors give the orientation, the eigenvalues the axis lengths
        Darray, Bmat = numpy.linalg.eigh(C)
        ax.add_artist(Ellipse((xmean[0], xmean[1]),
                              numpy.sqrt(Darray[0]),
                              numpy.sqrt(Darray[1]),
                              angle=numpy.arctan2(Bmat[1, 0], Bmat[0, 0]) * 180 / numpy.pi,
                              color="g",
                              alpha=0.7))
        # The best individual of each step as a red dot
        ax.plot([x[0]], [x[1]], c='r', marker='o')
        ax.axis([-5.12, 5.12, -5.12, 5.12])
        plt.draw()
    plt.show(block=True)


if __name__ == "__main__":
    main()
The green ellipses represent the covariance matrix at each step: as the search approaches the optimal solution $[0, 0]$, the ellipses shrink and the search range narrows. If you actually run the script above, the ellipse of each new step is added and drawn as an animation, which makes the change over time easier to follow. The red dots mark the best solution of each step, which also ends up at $[0, 0]$.
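For reference, the generate/evaluate/update loop above can also be driven by DEAP's built-in algorithms.eaGenerateUpdate, combined with tools.Statistics to log the best fitness per generation. This is a self-contained, shortened sketch with the same setup as the script above; it omits the plotting.

import numpy
from deap import algorithms, base, benchmarks, cma, creator, tools

N = 2
creator.create("FitnessMin", base.Fitness, weights=(-1.0,))
creator.create("Individual", list, fitness=creator.FitnessMin)

toolbox = base.Toolbox()
toolbox.register("evaluate", benchmarks.rastrigin)

numpy.random.seed(64)
strategy = cma.Strategy(centroid=[5.0]*N, sigma=3.0, lambda_=20*N)
toolbox.register("generate", strategy.generate, creator.Individual)
toolbox.register("update", strategy.update)

hof = tools.HallOfFame(1)
stats = tools.Statistics(lambda ind: ind.fitness.values)
stats.register("min", numpy.min)

# eaGenerateUpdate runs the same generate -> evaluate -> update loop internally
pop, logbook = algorithms.eaGenerateUpdate(toolbox, ngen=50, halloffame=hof, stats=stats)
print(hof[0], hof[0].fitness.values)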
CMA-ES Example: http://deap.gel.ulaval.ca/doc/default/examples/cmaes_plotting.html