A typical clustering algorithm is k-means. Because k-means is a very simple algorithm, it can produce poor clustering results on some data. In this article, we therefore introduce an implementation of kernel k-means, which maps the data into a higher-dimensional space with a nonlinear function and performs clustering there.
I tried clustering the following data with k-means.
At a glance, the data looks like two clusters, an inner part and an outer part, but k-means simply splits the points along a straight line.
Kernel k-means

In kernel k-means, the data space is mapped into a higher-dimensional space by a nonlinear function and clustering is performed there. That is, for data points $x \in X$ and a nonlinear function $\phi$, clustering is applied to $\phi(x)$. There are many possible choices of $\phi$, but instead of choosing $\phi$ directly, it is common to choose a kernel function $k(x_i, x_j) = \phi(x_i)^T \phi(x_j)$ (the kernel method).
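The point of the kernel method here is that the squared distance from $\phi(x_i)$ to a cluster mean $\mu_C = \frac{1}{|C|} \sum_{j \in C} \phi(x_j)$ can be written using kernel values only, without ever computing $\phi$ explicitly. The standard expansion (written out here for reference) is:

$$
\|\phi(x_i) - \mu_C\|^2 = k(x_i, x_i) - \frac{2}{|C|} \sum_{j \in C} k(x_i, x_j) + \frac{1}{|C|^2} \sum_{j \in C} \sum_{l \in C} k(x_j, x_l)
$$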
The kernel functions are as follows.
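Typical examples include the linear kernel and the Gaussian (RBF) kernel used later in this article:

$$
k(x_i, x_j) = x_i^T x_j \quad \text{(linear kernel)}
$$

$$
k(x_i, x_j) = \exp\left(-\gamma \|x_i - x_j\|^2\right) \quad \text{(Gaussian kernel)}
$$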
Choosing the linear kernel makes kernel k-means equivalent to ordinary k-means.
I tried clustering the same data with kernel k-means, using the Gaussian kernel with $\gamma = 0.1$. The source code has been uploaded here.
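The uploaded source is not reproduced here, but a minimal sketch of kernel k-means with a Gaussian kernel and $\gamma = 0.1$ might look like the following. This is an illustrative implementation using only NumPy; the function names and the ring-shaped sample data are assumptions for the example, not the author's original code.

```python
import numpy as np


def gaussian_kernel(X, gamma=0.1):
    """Gram matrix of the Gaussian kernel k(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2)."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * sq_dists)


def kernel_kmeans(K, n_clusters=2, n_iter=100, seed=0):
    """Cluster points given a precomputed (n, n) Gram matrix K; returns labels."""
    n = K.shape[0]
    rng = np.random.default_rng(seed)
    labels = rng.integers(n_clusters, size=n)  # random initial assignment
    for _ in range(n_iter):
        dist = np.full((n, n_clusters), np.inf)
        for c in range(n_clusters):
            mask = labels == c
            size = mask.sum()
            if size == 0:
                continue  # empty cluster: leave distance at infinity
            # ||phi(x_i) - mu_c||^2 expressed with kernel values only:
            # K_ii - (2/|C|) * sum_j K_ij + (1/|C|^2) * sum_{j,l} K_jl
            dist[:, c] = (np.diag(K)
                          - 2.0 * K[:, mask].sum(axis=1) / size
                          + K[np.ix_(mask, mask)].sum() / size ** 2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):  # assignments stopped changing
            break
        labels = new_labels
    return labels


# Illustrative data: two concentric rings, similar in spirit to the figure above
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, 200)
radius = np.concatenate([np.ones(100), 3.0 * np.ones(100)])
X = np.c_[radius * np.cos(theta), radius * np.sin(theta)] + 0.1 * rng.normal(size=(200, 2))

labels = kernel_kmeans(gaussian_kernel(X, gamma=0.1), n_clusters=2)
print(labels)
```

With the Gaussian kernel, the inner and outer rings become linearly separable in the implicit feature space, which is why this sketch can recover the two ring-shaped clusters that plain k-means cannot.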
You can see that the inner part and the outer part are now separated into different clusters.
Since k-means and kernel k-means both depend strongly on their initial values, results as clean as this are not always obtained. In addition, kernel k-means requires selecting a kernel function and setting its hyperparameters.
(2015/7/2: Fixed a figure that was slightly wrong.)