[PYTHON] I tried to implement hierarchical clustering

Introduction

Last time, I implemented K-means method, but this time I implemented a hierarchical clustering method.

What is hierarchical clustering?

Hierarchical clustering is a method of forming clusters in order from the most similar combination.

Implementation of hierarchical clustering

The python code is below.

#Installation of required libraries
import numpy as np
import pandas as pd

#Visualization
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns
from IPython.display import display
%matplotlib inline
sns.set_style('whitegrid')

#Class for normalization
from sklearn.preprocessing import StandardScaler

#Import what you need for hierarchical clustering
from scipy.cluster import hierarchy

First import the required libraries. I will try to implement it using iris data this time as well.

#iris data
from sklearn.datasets import load_iris

#Data read
iris = load_iris()
iris.keys()

df_iris = iris.data
df_target = iris.target
target_names = iris.target_names
df_labels = target_names[df_target]

#Data normalization (mean 0,Standard deviation 1)
scaler = StandardScaler()
df_iris_std = scaler.fit_transform(df_iris)

The pre-processing ends above. This time, I would like to carry out hierarchical clustering using Ward's method. There are many other ways to define the distance between clusters, so you need to choose the right method for your data.

#Distance calculation
dist = hierarchy.distance.pdist(df_iris_std, metric='euclidean')

#Clustering
linkage = hierarchy.linkage(dist, method='ward')

#Dendrogram
fig, ax = plt.subplots(figsize=(5,13))
ax = hierarchy.dendrogram(Z=linkage,
                orientation='right',
                labels=dataset_labels)
fig.show()

ウォード法.png

at the end

Thank you for reading to the end. This time, I implemented the hierarchical clustering method.

If you have a request for correction, we would appreciate it if you could contact us.

Recommended Posts

I tried to implement hierarchical clustering
I tried to implement StarGAN (1)
I tried to implement Deep VQE
I tried to implement adversarial validation
I tried to implement Realness GAN
I tried to implement PLSA in Python
I tried to implement Autoencoder with TensorFlow
I tried to implement permutation in Python
I tried to implement PLSA in Python 2
I tried to implement PPO in Python
I tried to debug.
I tried to paste
I tried to implement reading Dataset with PyTorch
I tried to implement TOPIC MODEL in Python
I tried to implement selection sort in python
I tried to implement the traveling salesman problem
I tried to learn PredNet
I tried to organize SVM.
I tried clustering with PyCaret
I tried to reintroduce Linux
I tried to introduce Pylint
I tried to summarize SparseMatrix
I tried to touch jupyter
I tried to implement multivariate statistical process management (MSPC)
I tried to implement and learn DCGAN with PyTorch
I tried to implement Minesweeper on terminal with python
I tried to implement a pseudo pachislot in Python
I tried to implement a recommendation system (content-based filtering)
I tried to implement an artificial perceptron with python
I tried to implement time series prediction with GBDT
I tried to implement GA (genetic algorithm) in Python
I tried to implement Grad-CAM with keras and tensorflow
I tried to implement SSD with PyTorch now (Dataset)
I tried to implement automatic proof of sequence calculation
I tried to implement a volume moving average with Quantx
I tried to implement a basic Recurrent Neural Network model
I tried to implement anomaly detection by sparse structure learning
I tried to touch Python (installation)
I tried to implement a one-dimensional cellular automaton in Python
I tried to implement breakout (deception avoidance type) with Quantx
[Django] I tried to implement access control by class inheritance.
I tried to explain Pytorch dataset
I tried Watson Speech to Text
I tried to implement ListNet of rank learning with Chainer
I tried to implement the mail sending function in Python
I tried to implement Harry Potter sort hat with CNN
I tried to organize about MCMC.
I tried to implement Perceptron Part 1 [Deep Learning from scratch]
I tried to implement blackjack of card game in Python
I tried to move the ball
I tried to estimate the interval.
I tried to implement SSD with PyTorch now (model edition)
I tried to implement anomaly detection using a hidden Markov model
I tried to implement a misunderstood prisoner's dilemma game in Python
I tried to create a linebot (implementation)
I tried to summarize Python exception handling
I tried using Azure Speech to Text.
I tried to summarize the umask command
I tried to create a linebot (preparation)
I tried to visualize AutoEncoder with TensorFlow
I tried to recognize the wake word