[PYTHON] DBSCAN with scikit-learn

A short script that runs the DBSCAN implementation in scikit-learn.

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import sys
import numpy as np
from sklearn import cluster

"""Specify parameters"""
dbscan = cluster.DBSCAN(eps=float(sys.argv[1]), min_samples=int(sys.argv[2]))

"""Read data"""
data_list = []
with open(sys.argv[3]) as f:
    for line in f:
        # Build a real list: in Python 3, map() returns a lazy iterator
        x = [float(v) for v in line.split()]
        data_list.append(x)
data = np.array(data_list)

"""Clustering"""
dbscan.fit(data)

"""View results"""
labels = dbscan.labels_
for i in range(len(labels)):
    if labels[i] != -1:
        print(labels[i], data[i])

How to use

Prepare a data file in which each row is one sample and each column holds one attribute value, with the values separated by spaces.
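As a minimal sketch of such a file, the snippet below writes the ten points that appear in the results printed in this post to a file named `data`, then reads them back one row per sample. The helper names `write_data_file` and `read_data_file` are made up for illustration. The exact contents of the original file are not known, and it may have held extra rows that were classified as noise and therefore never printed.

```python
# Sample points copied from the printed results in this post; the original
# data file may have contained further (noise) rows that are not shown.
points = [
    (0.0, 1.0), (8.5, 6.0), (2.0, 0.0), (1.5, 0.0), (1.0, 1.5),
    (10.0, 5.0), (9.0, 6.0), (8.0, 5.5), (9.5, 5.6), (1.0, 0.0),
]


def write_data_file(path, rows):
    """Write one sample per line, attribute values separated by a space."""
    with open(path, "w") as f:
        for row in rows:
            f.write(" ".join(str(v) for v in row) + "\n")


def read_data_file(path):
    """Parse the file back into a list of float lists, skipping blank lines."""
    with open(path) as f:
        return [[float(v) for v in line.split()] for line in f if line.strip()]


write_data_file("data", points)
rows = read_data_file("data")
print(len(rows), rows[0])
```

Each line of the resulting file looks like `8.5 6.0`, which is the layout the script's reading loop expects.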

Run the script, passing eps, min_samples, and the data file as arguments, in that order.

>> python dbscan.py 1.5 3 data
0.0 [ 0.  1.]
1.0 [ 8.5  6. ]
0.0 [ 2.  0.]
0.0 [ 1.5  0. ]
0.0 [ 1.   1.5]
1.0 [ 10.   5.]
1.0 [ 9.  6.]
1.0 [ 8.   5.5]
1.0 [ 9.5  5.6]
0.0 [ 1.  0.]

dbscan.labels_ holds the cluster label assigned to each sample. A label of -1 means the sample is noise and was not assigned to any cluster; the script skips these samples when printing.
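To see the noise label in practice, here is a small sketch that clusters the same ten points in memory with the same parameters (eps=1.5, min_samples=3). The eleventh point, (5.0, 20.0), is not from the original data; it is added here as an assumption so that at least one sample ends up as noise.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Ten points from the results above plus one far-away point, (5.0, 20.0),
# added here as an assumption to show what a noise sample looks like.
data = np.array([
    [0.0, 1.0], [8.5, 6.0], [2.0, 0.0], [1.5, 0.0], [1.0, 1.5],
    [10.0, 5.0], [9.0, 6.0], [8.0, 5.5], [9.5, 5.6], [1.0, 0.0],
    [5.0, 20.0],
])

labels = DBSCAN(eps=1.5, min_samples=3).fit(data).labels_

# Cluster ids are non-negative integers; -1 is reserved for noise.
n_clusters = len(set(labels.tolist()) - {-1})
noise = data[labels == -1]

print("clusters:", n_clusters)
print("noise samples:", noise.tolist())
```

With this input the two groups of five points should come out as clusters 0 and 1, and only the added point should carry the label -1.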
