[PYTHON] Label each point on a seaborn scatter plot

This is my first post, from a 42-year-old guy. I recently switched to seaborn.

Problem

I drew a scatter plot but could not tell which id corresponded to which point, which made it hard to read. When I looked into it, I found that each point can be labeled, so I am leaving this note as a memo.

Implementation

# -*- coding: utf-8 -*-
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt


def label_point(x, y, id, ax):
  # Collect the coordinates and ids into one DataFrame, then write each id at its point
  df_tmp = pd.concat({'x': x, 'y': y, 'id': id}, axis='columns')
  for i, point in df_tmp.iterrows():
    ax.text(point['x'], point['y'], point['id'], fontsize=5)


if __name__ == '__main__':
  rfilename = 'testdata.csv'
  data = pd.read_csv(rfilename)

  # Draw the scatter plot (columns 3 and 4 are the coordinates, column 5 sets the hue)
  ax = sns.scatterplot(x=data.iloc[:, 3], y=data.iloc[:, 4], hue=data.iloc[:, 5])

  # Label each point with its id (column 0)
  label_point(data.iloc[:, 3], data.iloc[:, 4], data.iloc[:, 0], ax)

  # Save
  wfilename = 'testdraw.png'
  plt.savefig(wfilename)
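
With ax.text the label starts exactly on the marker, so the two can overlap. As a possible variation (not the method used in this post), matplotlib's ax.annotate can shift each label by a few points. The sketch below is only an illustration: the helper name label_point_offset, the offset value, and the rounded sample values are my own assumptions, and it uses a plain matplotlib scatter so that it runs without testdata.csv. The Axes returned by sns.scatterplot can be passed in the same way.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def label_point_offset(x, y, labels, ax, offset=(3, 3), fontsize=5):
    """Annotate each (x, y) point with its label, shifted by `offset` (in points)."""
    annotations = []
    for xi, yi, label in zip(x, y, labels):
        annotations.append(
            ax.annotate(
                str(label),
                xy=(xi, yi),                 # the data point being labeled
                xytext=offset,               # shift relative to that point
                textcoords="offset points",
                fontsize=fontsize,
            )
        )
    return annotations


# A few rows in the shape of this post's data (values rounded for brevity)
data = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "1-dim": [147.055, -128.938, 20.074, -64.745, -26.723],
    "2-dim": [-81.857, 88.112, 113.524, 49.774, -164.792],
})

fig, ax = plt.subplots()
ax.scatter(data["1-dim"], data["2-dim"])
added = label_point_offset(data["1-dim"], data["2-dim"], data["id"], ax)
```

Because the function iterates with zip instead of iterrows, each id keeps its integer type. Saving the figure works as in the script above, with plt.savefig.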

Part of the data

The data are linguistic features of compositions about travel, reduced to two dimensions. Details are omitted.

id  sex  age  1-dim              2-dim              lab
1   1    63   147.05500793457    -81.8567810058594  me
2   2    45   -128.938018798828  88.1118698120117   fy
3   1    66   20.0744075775146   113.524360656738   me
4   1    68   -64.7453689575195  49.7739143371582   me
5   2    49   -26.7232112884521  -164.791641235352  fy

Drawing result

The ids are drawn as floating-point numbers, but that is beside the point here, so I left them as they are.

[Figure: testdraw.png]
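
For reference, the floating-point ids come from iterrows: it builds one Series per row, so a row that mixes integer and float columns is upcast to float. The small sketch below, using made-up values in the shape of this post's data, shows the effect and one possible way around it with itertuples.

```python
import pandas as pd

# Two rows shaped like the data in this post (values rounded)
df = pd.DataFrame({
    "x": [147.055, -128.938],
    "y": [-81.857, 88.112],
    "id": [1, 2],
})

# iterrows() returns each row as a single Series, so the int id is upcast to float
labels_iterrows = [str(row["id"]) for _, row in df.iterrows()]

# itertuples() keeps each column's own type, so the id stays an integer
labels_itertuples = [str(row.id) for row in df.itertuples(index=False)]

print(labels_iterrows)    # ['1.0', '2.0']
print(labels_itertuples)  # ['1', '2']
```

Converting the value when drawing, for example with int(point['id']), would be another option.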
