[PYTHON] Easily visualize the correlation coefficient between variables

Display a list of correlation coefficients

Use a heatmap to display the correlation coefficients between all pairs of variables at once, using the Kaggle Titanic passenger data (train.csv). Note that only numeric columns can be checked this way.

corr.py


# import libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Jupyter/IPython magic: draw plots inline (omit when running as a plain script)
%matplotlib inline

train = pd.read_csv('./train.csv')
train.head()

[Image: 2017-07-25_122911.png (output of train.head())]

corr.py


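plt.figure(figsize=(8, 6))  # heatmap size in inches
# annot: write the coefficient in each cell; linewidths: width of the lines separating cells
# numeric_only=True (pandas 1.5+) skips text columns such as Name, which raise an error in pandas 2.x
sns.heatmap(train.corr(numeric_only=True), annot=True, cmap='plasma', linewidths=.5)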

[Image: 2017-07-25_123429.png (correlation heatmap)]
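Since only numeric columns can be correlated, it can help to see which columns actually end up in the matrix. The following is a minimal sketch, not part of the original post: `sample` is a hypothetical miniature stand-in for train.csv with made-up values, and `select_dtypes` is one possible way to keep only the numeric columns.

```python
import pandas as pd

# Hypothetical stand-in for train.csv; the values are made up for illustration
sample = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1],
    "Pclass": [3, 1, 3, 3, 1],
    "Fare": [7.25, 71.28, 7.92, 8.05, 53.10],
    "Sex": ["male", "female", "female", "male", "female"],  # text column
})

# Keep only the numeric columns, then compute the pairwise correlation matrix
numeric = sample.select_dtypes(include="number")
corr = numeric.corr()

# The text column "Sex" is not part of the result
print(list(corr.columns))
```

A text column such as Sex would first have to be converted to numbers before it could appear in the heatmap.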

In this example, the quantity of interest is survival, so reading along the Survived column (or row) of the heatmap shows which variables correlate with it.
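Reading one column off the heatmap can also be done numerically. The sketch below is an illustration under assumptions rather than the original author's code: `sample` is a hypothetical toy table with made-up values, and the coefficients for Survived are sorted by absolute value so that strong negative correlations are not overlooked.

```python
import pandas as pd

# Hypothetical toy data standing in for the numeric columns of train.csv
sample = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1, 0],
    "Pclass": [3, 1, 2, 3, 1, 3],
    "Age": [22, 38, 26, 35, 35, 54],
    "Fare": [7.25, 71.28, 7.92, 8.05, 53.10, 51.86],
})

# Take the Survived column of the correlation matrix, drop its
# self-correlation, and order the rest by strength (absolute value)
survived_corr = (
    sample.corr()["Survived"]
    .drop("Survived")
    .sort_values(key=lambda s: s.abs(), ascending=False)
)

print(survived_corr)
```

With the real data, the same three steps could be applied to `train` in place of `sample`, provided only numeric columns are passed to `corr()`.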

Reference: Seaborn Heatmap
