[PYTHON] Data visualization method using matplotlib (+ pandas) (5)

This is the final installment of the data visualization series that has run through the previous posts.

Scatter plot

We will use the data from pydata-book as before.

pydata-book/ch08/macrodata.csv https://github.com/pydata/pydata-book/blob/master/ch08/macrodata.csv

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Read the CSV data
macro = pd.read_csv('macrodata.csv')

# Select a few columns
data = macro[['cpi', 'm1', 'tbilrate', 'unemp']]

# .diff() replaces each value with its difference from the previous row.
# The first row has no predecessor and becomes NaN, so remove it with .dropna()
trans_data = np.log(data).diff().dropna()

# trans_data now holds the change in the log of each column from the previous row
# Show the last 5 rows
print(trans_data[-5:])
# =>
#           cpi        m1  tbilrate     unemp
# 198 -0.007904  0.045361 -0.396881  0.105361
# 199 -0.021979  0.066753 -2.277267  0.139762
# 200  0.002340  0.010286  0.606136  0.160343
# 201  0.008419  0.037461 -0.200671  0.127339
# 202  0.008894  0.012202 -0.405465  0.042560

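# Draw a scatter plot from two columns
plt.scatter(trans_data['m1'], trans_data['unemp'])

# Save before plt.show(); once the window is closed the figure is discarded
# and a later savefig would write an empty image
plt.savefig("image.png")
plt.show()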

image.png
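
To make the `np.log(data).diff().dropna()` step easier to follow, here is a minimal sketch on a tiny made-up series. The `prices` frame and its values are assumptions for illustration only and do not come from macrodata.csv. It shows that the log difference of consecutive rows is close to the ordinary percentage change when the changes are small.

```python
import numpy as np
import pandas as pd

# Hypothetical values, made up for illustration (not from macrodata.csv)
prices = pd.DataFrame({"price": [100.0, 110.0, 99.0]})

# Log difference: log(x_t) - log(x_{t-1}).
# The first row has no predecessor, becomes NaN, and is dropped.
log_diff = np.log(prices).diff().dropna()

# Ordinary relative change between consecutive rows, for comparison
pct = prices.pct_change().dropna()

print(log_diff)
print(pct)
```

The two results agree roughly in sign and size, which is why the log difference is often used as a stand-in for the growth rate between rows.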

Scatterplot matrix

A [scatter plot matrix](http://www.okada.jp.org/RWiki/?%A5%B0%A5%E9%A5%D5%A5%A3%A5%C3%A5%AF%A5%B9%BB%B2%B9%CD%BC%C2%CE%E3%BD%B8%A1%A7%BB%B6%C9%DB%BF%DE%B9%D4%CE%F3) draws a scatter plot for every pair in a set of variables. You can create one with the scatter_matrix function.

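# Generate a scatter plot matrix
# (older pandas versions exposed this as pandas.tools.plotting, which has since been removed)
from pandas.plotting import scatter_matrix
scatter_matrix(trans_data, diagonal='kde', color='k', alpha=0.3)

# Save before plt.show() so the file is not empty
plt.savefig("image2.png")
plt.show()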

image2.png
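
If macrodata.csv is not at hand, the same call can be tried on synthetic data. The following is only a sketch under stated assumptions: the `demo` frame, its column names, and the output file name `demo_matrix.png` are all made up for illustration. It uses `diagonal="hist"` rather than `'kde'` so that SciPy is not needed, and the non-interactive Agg backend so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; no window is opened
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

# Synthetic data, made up for illustration:
# 'b' is built to depend on 'a', while 'c' is independent of both
rng = np.random.default_rng(0)
a = rng.normal(size=200)
demo = pd.DataFrame({
    "a": a,
    "b": 0.8 * a + rng.normal(scale=0.5, size=200),
    "c": rng.normal(size=200),
})

# One panel per pair of columns, with a histogram of each column on the diagonal
axes = scatter_matrix(demo, diagonal="hist", alpha=0.3)

plt.savefig("demo_matrix.png")
plt.close("all")
```

scatter_matrix returns a 2D array of Axes, one row and one column per variable, so individual panels can be adjusted afterwards if needed.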

It is a simple yet powerful way to inspect the correlation between every pair of variables at a glance.
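
As a numeric companion to the visual check, pandas can also compute the pairwise correlation coefficients with `DataFrame.corr()`. This is not part of the original walkthrough; the sketch below uses synthetic data (the `demo` frame and its columns are assumptions for illustration), and the same call could be applied to `trans_data`.

```python
import numpy as np
import pandas as pd

# Synthetic data, made up for illustration
rng = np.random.default_rng(0)
x = rng.normal(size=500)
demo = pd.DataFrame({
    "x": x,
    "y": 2.0 * x + rng.normal(scale=0.1, size=500),  # strongly tied to x
    "z": rng.normal(size=500),                       # independent of x
})

# Pearson correlation coefficient for every pair of columns
corr = demo.corr()
print(corr.round(2))
```

Pairs that form a tight diagonal band in the scatter plot matrix show coefficients near 1 or -1 here, while shapeless clouds show values near 0.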

Reference

Introduction to Data Analysis with Python: Data Processing Using NumPy and pandas http://www.oreilly.co.jp/books/9784873116556/
