Let's implement the SVM from Pattern Recognition and Machine Learning, Chapter 7, "Sparse Kernel Machines". Figure 7.2 appears to use a Gaussian kernel, so this implementation follows suit.
As usual, the explanation of the SVM itself, such as how the decision boundary is found, is left to PRML and Hajipata; here I only sketch the flow of the implementation.
(1) Solve the dual representation (7.10) of the Lagrangian by quadratic programming.
\tilde{L}({\bf a}) = \sum_{n=1}^N a_n - \frac{1}{2} \sum_{n=1}^N \sum_{m=1}^N a_n a_m t_n t_m k({\bf x}_n, {\bf x}_m) (7.10)
Implementing a quadratic programming solver from scratch would be unreasonable, so I quietly reached for an existing one: the free and popular library cvxopt.
As written, it is hard to see how (7.10) maps onto the form the cvxopt documentation expects, so let's transform the formula.
- \tilde{L}({\bf a}) = \frac{1}{2} {\bf a}^{\mathrm{T}} K {\bf a} - {\bf 1}^{\mathrm{T}} {\bf a}, \quad K_{nm} = t_n t_m k({\bf x}_n, {\bf x}_m) (7.10)'
The notation is loose, but comparing (7.10)' against the cvxopt documentation lets you model the problem as-is, and cvxopt returns the Lagrange multipliers $a$.
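Before modeling the real problem, a minimal toy QP in cvxopt's standard form may help check the mapping (my own example, unrelated to the SVM below): `solvers.qp(P, q, G, h)` minimizes $\frac{1}{2}{\bf x}^{\mathrm{T}} P {\bf x} + {\bf q}^{\mathrm{T}} {\bf x}$ subject to $G{\bf x} \leq {\bf h}$.

from cvxopt import matrix, solvers

# minimize x1^2 + x2^2 - 2*x1 - 5*x2  subject to  x >= 0
P = matrix([[2.0, 0.0], [0.0, 2.0]])    # note: inner lists are columns in cvxopt
q = matrix([-2.0, -5.0])
G = matrix([[-1.0, 0.0], [0.0, -1.0]])  # -x <= 0, i.e. x >= 0
h = matrix([0.0, 0.0])
sol = solvers.qp(P, q, G, h)
print(sol['x'])                         # optimum near (1.0, 2.5)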
(2) In an SVM, only the support vectors take part in determining the decision boundary, and they are narrowed down by the KKT conditions. This has not fully sunk in for me yet, but the vectors on the margin satisfy $t_n y({\bf x}_n) - 1 = 0$, so from (7.16) I interpret this as requiring $a_n > 0$. Use this to extract the support vectors.
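In code, the solver only returns values numerically close to zero rather than exact zeros, so a small threshold stands in for the $a_n > 0$ test. A minimal sketch with toy values (my own illustration, not the actual multipliers):

import numpy as np

a = np.array([1e-9, 0.8, 2e-10, 1.3])  # toy multipliers for illustration
S = np.where(a > 1e-4)[0]              # indices with a_n effectively > 0
print(S)                               # -> [1 3]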
(3) Find $b$ from (7.18) using the support vectors just obtained.
(4) Substitute the obtained values into (7.13) to get the decision boundary.
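For reference, the two PRML formulas used here are

y({\bf x}) = \sum_{n=1}^N a_n t_n k({\bf x}, {\bf x}_n) + b (7.13)

b = \frac{1}{N_S} \sum_{n \in S} \left( t_n - \sum_{m \in S} a_m t_m k({\bf x}_n, {\bf x}_m) \right) (7.18)

where $S$ is the set of support vector indices and $N_S$ is its size.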
This time, I had a lot of trouble drawing the figure. I could not plot the decision boundary well, so I referred to this article by aidiary. Thank you very much.
import numpy as np
import matplotlib.pyplot as plt
from cvxopt import matrix, solvers
%matplotlib inline
# Gaussian kernel with a hand-picked sigma
def kernel(x, y):
    sigma = 5.0
    return np.exp(-np.linalg.norm(x - y)**2 / (2 * sigma**2))
#(7.10)' (Quadratic Programming)
def L(t, X, N):
    # Gram matrix weighted by the targets: K[n, m] = t_n t_m k(x_n, x_m)
    K = np.zeros((N, N))
    for i in range(N):
        for j in range(N):
            K[i, j] = t[i] * t[j] * kernel(X[i], X[j])
    Q = matrix(K)
    p = matrix(-np.ones(N))
    G = matrix(np.diag([-1.0] * N))  # -a_n <= 0, i.e. a_n >= 0
    h = matrix(np.zeros(N))
    A = matrix(t, (1, N))            # equality constraint: sum_n a_n t_n = 0
    b = matrix(0.0)
    sol = solvers.qp(Q, p, G, h, A, b)
    return np.array(sol['x']).reshape(N)
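# Note: cvxopt prints solver progress by default; if that is noisy, setting
# solvers.options['show_progress'] = False before calling solvers.qp turns it off.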
#(7.13)
def y_x(a, t, X, N, b, x):
    total = 0
    for n in range(N):
        total += a[n] * t[n] * kernel(x, X[n])
    return total + b
#(7.18) (renamed to bias so it does not collide with the variable b below)
def bias(a, t, X, S):
    total = 0
    for n in S:
        inner = 0
        for m in S:
            inner += a[m] * t[m] * kernel(X[n], X[m])
        total += t[n] - inner
    return total / len(S)
if __name__ == "__main__":
    N = 36

    # Blue class: sampled from a 2-D Gaussian
    mu_blue = [1, -1]
    cov = [[0.1, 0.05], [0.05, 0.1]]
    x_blue, y_blue = np.random.multivariate_normal(mu_blue, cov, N // 2).T

    # Red class: fixed by hand
    x_red = [0.3, 0.8, 0.9, 0.95, 1.1, 1.3, 1.6, 1.9, 1.75, 1.8, 2.0, 2.1, 2.3, 2.25, 2.4, 2.7, 3.0, 3.2]
    y_red = [-0.2, 0.1, 0.25, 0.14, -0.1, 1.6, 1.2, 0.6, 0.8, -0.6, -0.8, -0.75, 1.2, -1.15, -0.12, -0.3, -0.4, 1.4]

    # Targets: +1 for blue, -1 for red
    t_blue = np.ones((1, N // 2))
    t_red = -1 * np.ones((1, N // 2))

    blue = np.vstack((x_blue, y_blue))
    red = np.vstack((x_red, y_red))
    X = np.concatenate((blue, red), axis=1).T   # (N, 2) data matrix
    t = np.concatenate((t_blue, t_red), axis=1).T  # (N, 1) targets
    #(7.10)' (Quadratic Programming)
    a = L(t, X, N)

    # Extract indices of the support vectors (KKT conditions):
    # keep only multipliers above a small numerical tolerance
    S = []
    for n in range(len(a)):
        if a[n] < 0.0001:
            continue
        S.append(n)

    #(7.18)
    b = bias(a, t, X, S)
    # Plot training data
    plt.scatter(x_blue, y_blue, color='b', marker='x')
    plt.scatter(x_red, y_red, color='r', marker='x')

    # Emphasize support vectors
    for n in S:
        plt.scatter(X[n, 0], X[n, 1], color='g', marker='o')

    # Plot the decision boundary: evaluate y(x) on a flattened grid,
    # then restore the grid shape and draw the y(x) = 0 contour
    X1, X2 = np.meshgrid(np.linspace(-10, 10, 100), np.linspace(-10, 10, 100))
    w, h = X1.shape
    X1.resize(X1.size)
    X2.resize(X2.size)
    Z = np.array([y_x(a, t, X, N, b, np.array([x1, x2])) for (x1, x2) in zip(X1, X2)])
    X1.resize((w, h))
    X2.resize((w, h))
    Z.resize((w, h))
    plt.contour(X1, X2, Z, [0.0], colors='k', linewidths=1, origin='lower')
    plt.xlim(0, 4)
    plt.ylim(-2, 2)
    plt.title("Figure 7.2")