Data visualization in Python: draw cool heatmaps

Graph drawing in Python

The standard library for drawing charts in Python is "matplotlib", but its default output looks a little dated, and its notation is often criticized as complicated.

Therefore, in this article I will show how to use "Seaborn", a wrapper that makes Matplotlib's functionality prettier and easier to use. See the link below for an introduction to Seaborn and its basic usage.

◆ Beautiful graph drawing with python -Use seaborn to improve data analysis and visualization Part 1 http://qiita.com/hik0107/items/3dc541158fceb3156ee0

Heat map

With seaborn you can draw a beautiful heatmap like the one below (excerpt from Seaborn's tutorial site).

A heatmap has strong visual impact, and it is easy to read even for people who are not comfortable with raw numbers. I think it is worth learning how to use it.

image

Reference) http://stanford.edu/~mwaskom/software/seaborn/examples/many_pairwise_correlations.html

How to do it

Let me explain how to do it. First, import the libraries you need.

prepare.py


import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

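Prepare the data and convert it to a format that the heatmap function can read. Seaborn provides some sample datasets through sns.load_dataset (they are fetched from its online repository, so an internet connection is needed the first time), so let's use one of them. This time I will use the "flights" data.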

data.py


df_flights = sns.load_dataset('flights')
df_flights.head(5)

If you look at the head, you can see that the data is in long format: one row for each combination of year and month.

data.py


   year     month  passengers
0  1949   January         112
1  1949  February         118
2  1949     March         132
3  1949     April         129
4  1949       May         121

Let's say you're curious about the trend in passengers along the two axes of year and month. In other words, I will draw a heatmap with year on the x-axis and month on the y-axis.

A heatmap can be drawn with a function called sns.heatmap, but the data you feed it needs some preparation. It has to be reshaped into pivot format, with the variable you want on the y-axis in the index (rows) and the variable you want on the x-axis in the columns.

data.py


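# Rows (index) become the y-axis, columns become the x-axis.
# The string 'mean' avoids the FutureWarning newer pandas raises for np.mean.
df_flights_pivot = pd.pivot_table(data=df_flights, values='passengers',
                                  columns='year', index='month', aggfunc='mean')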
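
To make the reshaping step concrete, here is a minimal sketch on a tiny hand-made table. The names df_long and df_wide and the toy values are assumptions for illustration only, not part of the flights dataset; only pandas is needed.

```python
import pandas as pd

# Hypothetical toy data in the same long format as the flights table:
# one row per (year, month) pair.
df_long = pd.DataFrame({
    "year": [1949, 1949, 1950, 1950],
    "month": ["January", "February", "January", "February"],
    "passengers": [112, 118, 115, 126],
})

# index -> rows (vertical axis of the heatmap),
# columns -> horizontal axis of the heatmap.
df_wide = pd.pivot_table(
    data=df_long,
    values="passengers",
    index="month",
    columns="year",
    aggfunc="mean",
)

print(df_wide)
```

Note that plain string months, as in this toy table, come out sorted alphabetically in the pivoted index, so the row order may need adjusting before plotting.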

If you are not familiar with data processing with Python Pandas, please refer to the following.

Reference: Data processing with Python

A rudimentary summary of data manipulation in Python Pandas-first half & second half http://qiita.com/hik0107/items/d991cc44c2d1778bb82e http://qiita.com/hik0107/items/0ae69131e5317b62c3b7

All you have to do now is give seaborn a Pivot-formatted dataframe.

draw.py


sns.heatmap(df_flights_pivot)

A figure like this is displayed. You can see at a glance that the number of passengers has increased year by year since 1949, and that it is particularly large around July and August. It also seems that the number of passengers dips a little in November every year and rises again in December.

image

A little more polish

You can leave the above figure as it is, but let's polish its appearance a little more. For example, like this:

draw.py


plt.figure(figsize=(12, 9))
sns.heatmap(df_flights_pivot, annot=True, fmt='g', cmap='Blues')

annot is an argument that writes the value into each cell, fmt is the format string applied to those values ('g' is the general number format), and cmap (color map) specifies the gradient color palette.

The result looks like this. This version is better when you want to discuss specific values while looking at the figure.
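
As a rough, self-contained sketch of these three arguments, the snippet below draws an annotated heatmap on a small hand-made table, so it runs without downloading the flights data. The name df_small and its values are hypothetical, and the "Agg" backend is only chosen so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; assumption for headless runs
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical pivot-style table: rows are months, columns are years.
df_small = pd.DataFrame(
    {1949: [112, 118, 132], 1950: [115, 126, 141]},
    index=["January", "February", "March"],
)

fig, ax = plt.subplots(figsize=(6, 4))
# annot=True writes each value in its cell, fmt="g" formats it as a
# general number, cmap="Blues" selects a light-to-dark blue gradient.
sns.heatmap(df_small, annot=True, fmt="g", cmap="Blues", ax=ax)

# Each annotated cell is a text object on the axes.
labels = [t.get_text() for t in ax.texts]
print(labels)
```

In an interactive session you would drop the backend line and call plt.show() to display the figure.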

image

See also

Beautiful graph drawing with python -seaborn makes data analysis and visualization easier Part 1 http://qiita.com/hik0107/items/3dc541158fceb3156ee0

Beautiful graph drawing with python -seaborn makes data analysis and visualization easier Part 2 http://qiita.com/hik0107/items/7233ca334b2a5e1ca924

If you are interested in data scientist work, start with this summary of literature and videos http://qiita.com/hik0107/items/ef5e044d2f47940ba712
