[PYTHON] Correlation in data preprocessing

This is a rough guide to interpreting correlation coefficients when preprocessing data for machine learning.

I hope it helps.

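# `train` is assumed to be a pandas DataFrame that has already been loaded.
# Compute the pairwise correlation coefficients between its columns.
train_corr = train.corr()
print(train_corr)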

# -1.0 to -0.7 ... Strong negative correlation
# -0.7 to -0.4 ... Negative correlation
# -0.4 to -0.2 ... Weak negative correlation
# -0.2 to +0.2 ... Almost no correlation
# +0.2 to +0.4 ... Weak positive correlation
# +0.4 to +0.7 ... Positive correlation
# +0.7 to +1.0 ... Strong positive correlation
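
As a minimal sketch, the ranges above can be turned into a small helper that labels each entry of a correlation matrix. The function name `describe_correlation` and the toy DataFrame are hypothetical and used only for illustration. The guide does not say which band a boundary value such as 0.7 belongs to, so this sketch assumes it goes to the stronger band.

```python
import math

import pandas as pd


def describe_correlation(r: float) -> str:
    """Map a correlation coefficient to the rough labels in the guide."""
    if math.isnan(r):
        # corr() yields NaN for constant columns, for example
        return "Undefined"
    # Assumption: a value exactly on a boundary goes to the stronger band.
    if r <= -0.7:
        return "Strong negative correlation"
    if r <= -0.4:
        return "Negative correlation"
    if r <= -0.2:
        return "Weak negative correlation"
    if r < 0.2:
        return "Almost no correlation"
    if r < 0.4:
        return "Weak positive correlation"
    if r < 0.7:
        return "Positive correlation"
    return "Strong positive correlation"


# Hypothetical toy data standing in for `train`
train = pd.DataFrame({
    "x": [1, 2, 3, 4, 5],
    "y": [2, 4, 6, 8, 10],   # rises with x
    "z": [5, 4, 3, 2, 1],    # falls as x rises
})

train_corr = train.corr()
# Apply the labelling function to every cell, column by column
labels = train_corr.apply(lambda col: col.map(describe_correlation))
print(labels)
```

These thresholds are a rule of thumb, so the cut-offs in the helper can be adjusted to suit the data at hand.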
