Overview

Purpose: Cluster customers based on purchase history and similar data to understand the characteristics of each group
Method: Clustering with KMeans
Output: Number of customers in each cluster and the characteristics of each cluster
Area of use: Marketing

Approach

Store the data in a DataFrame
Determine the number of clusters
Run clustering with KMeans
Output the clustering results
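
The "determine the number of clusters" step is not spelled out here. One common option is to compare silhouette scores across candidate values of k. The following is only a minimal sketch under assumptions: it uses synthetic data from make_blobs as a stand-in for real customer features, and the candidate range 2 to 8 is an arbitrary choice for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in for customer features (assumption: not the article's data)
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.6, random_state=0)

# Fit KMeans for each candidate k and record the mean silhouette score
scores = {}
for k in range(2, 9):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# A higher silhouette score suggests tighter, better-separated clusters
best_k = max(scores, key=scores.get)
print(scores)
print("best k by silhouette:", best_k)
```

The score is a heuristic, so the value it suggests is worth checking against what is practical for the marketing use case (for example, how many segments can actually be acted on).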

Code

import pandas as pd
import numpy as np
from sklearn.cluster import KMeans
from collections import Counter

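# Prepare the DataFrame df here: one row per customer, numeric feature columns only

num_clus = 4  # Number of clusters
# n_init is set explicitly so the result does not depend on the scikit-learn version's default
kmeans = KMeans(n_clusters=num_clus, n_init=10, random_state=0).fit(df)

print(Counter(kmeans.labels_))  # Number of customers in each cluster

df['cluster_id'] = kmeans.labels_  # Add the cluster ID to the original DataFrame

# Mean of each feature per cluster (the cluster_id column itself is excluded)
for i in range(num_clus):
    print(df[df['cluster_id'] == i].drop(columns='cluster_id').mean())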
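
For readers who want something runnable from start to finish, here is a rough sketch of the same flow. Everything about the data is an assumption: the column names purchase_count and avg_spend and the random values are hypothetical placeholders, not real customer data. The sketch also standardizes the features with StandardScaler before clustering, which the code above does not do; this is a common precaution because KMeans relies on Euclidean distance and large-valued columns would otherwise dominate.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical customer table (column names and values are made up for illustration)
df = pd.DataFrame({
    "purchase_count": rng.poisson(lam=5, size=200),
    "avg_spend": rng.gamma(shape=2.0, scale=30.0, size=200),
})

features = ["purchase_count", "avg_spend"]

# Standardize so that each feature contributes comparably to the distance
X_scaled = StandardScaler().fit_transform(df[features])

num_clus = 4  # Number of clusters
kmeans = KMeans(n_clusters=num_clus, n_init=10, random_state=0).fit(X_scaled)
df["cluster_id"] = kmeans.labels_

# Number of customers in each cluster
print(df["cluster_id"].value_counts().sort_index())

# Per-cluster mean of each feature, in the original (unscaled) units
print(df.groupby("cluster_id")[features].mean())
```

Using groupby gives the per-cluster averages in one table, which tends to be easier to compare than separate printouts for each cluster.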
