[PYTHON] Implement "Data Visualization Design #3" with pandas and matplotlib

What is a data visualization design?

It is a collection of data visualization tips published on note by Go Ando of THE GUILD, a company known for its UX- and UI-focused services.

https://note.mu/goando/n/n99f6c395ae8a

What about #1 and #2?

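Notes

The font is set by the line below. Hiragino Sans ships with macOS; on other systems, replace it with a font that is installed.

plt.rcParams['font.family'] = 'Hiragino Sans'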
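If the font might not be installed, a small check like the following sketch can choose one that matplotlib actually finds. The helper name `pick_font` and the candidate list are assumptions made for illustration; only `font_manager.fontManager.ttflist` and `plt.rcParams` come from matplotlib itself.

```python
from matplotlib import font_manager
import matplotlib.pyplot as plt


def pick_font(candidates, fallback="DejaVu Sans"):
    """Return the first candidate that matplotlib has registered, else the fallback."""
    # Names of all TrueType/OpenType fonts matplotlib found on this machine
    installed = {font.name for font in font_manager.fontManager.ttflist}
    for name in candidates:
        if name in installed:
            return name
    return fallback


# The candidate names below are examples, not a recommendation from the article
plt.rcParams['font.family'] = pick_font(['Hiragino Sans', 'Yu Gothic', 'Noto Sans CJK JP'])
```

DejaVu Sans is bundled with matplotlib, so the fallback is always available, although it has no Japanese glyphs.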

13. It is effective to put a graph in the table

picture_pc_1bd02bdb34795599c91bc4e22a5657e2.png

Use pandas.

import pandas as pd
import numpy as np

%matplotlib inline
# data
apple_products = pd.DataFrame({"product":["iPhone","iPad","Mac","Services","Other"],
                              "Earnings(M dollars)":[141319,19222,25859,29980,12863],
                             "unit":[216756,43753,19251,np.nan,np.nan]})

# Number format for each column
format_dict = {'Earnings(M dollars)':'{0:,.0f}', 'unit':'{0:,.0f}'}

# Render the table with a bar drawn inside each numeric cell
(apple_products
 .style
 .format(format_dict)
 .hide(axis="index")  # pandas >= 1.4; older versions use .hide_index(), which was removed in pandas 2.0
 .bar(color="#99ceff", vmin=0, subset=['Earnings(M dollars)'], align='zero')
 .bar(color="#ff999b", vmin=0, subset=['unit'], align='zero'))

Screenshot 2019-12-03 21.55.48.png

15. Stacked graphs are useful for comparing percentages

picture_pc_ef8b0b193e70633750354d6f86d444df.png

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates


# Explicitly register the pandas converters with matplotlib (added to avoid an error)
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()

#Font settings
plt.rcParams['font.family'] = 'Hiragino Sans'  
plt.rcParams['font.weight'] = 'heavy'

#data
music_env_df = pd.DataFrame({"radio":[0.4,0.21],"CD / Download":[0.22,0.44],"Video distribution":[0.2,0.3],"Music distribution":[0.18,0.05]},
            index=["GLOBAL","JAPAN"])
# Cumulative shares, used below as the bottom of each stacked segment (this line only displays them)
music_env_df.T.cumsum()
# Bar colors
bar_colors = ["#3B7780","#98C550","#7FC2CB","#E9C645"]

# x tick labels
x = music_env_df.index
# Name of each item in the graph
keys = music_env_df.keys()

fig,ax = plt.subplots(figsize=(7,7))

# 1. Hide the left and right spines
for side in ['left', 'right']:
    ax.spines[side].set_visible(False)

# 2. Remove the left-axis ticks and tick labels
ax.tick_params(left=False, labelleft=False)

# 3. Change the color of the top and bottom spines
ax.spines['bottom'].set_color("dimgray")
ax.spines['top'].set_color("dimgray")

# 4. x-axis tick settings
ax.tick_params(axis='x', labelsize='x-large',color="dimgray",labelcolor="dimgray")

# 5.Plot stacked graphs and store plot information
bar_info = []
for i in range(len(keys)):
    if i == 0:
        bar_info.append(ax.bar(x, music_env_df.T.iloc[i],width=0.5,color=bar_colors[i]))
    else:
        bar_info.append(ax.bar(x, music_env_df.T.iloc[i], bottom=music_env_df.T.cumsum().iloc[i-1],width=0.5,color=bar_colors[i]))

# 6. Label each segment
for i,one in enumerate(bar_info):
    # Center coordinates of this segment in each bar, used to place the percentage labels
    bar_center = [[0,0],[0,0]]
    # Line that emphasizes the difference between the two bars:
    # bar_line[0] is the start point, bar_line[1] is the offset (dx, dy) to the end point
    bar_line = [[0,0],[0,0]]
    for j,one_bar in enumerate(one):
        bar_center[j][0] =  one_bar.xy[0]+one_bar.get_width()/2
        bar_center[j][1] =  one_bar.xy[1]+one_bar.get_height()/2
        # Display the item name to the left of the first bar
        if j == 0:    
            ax.annotate(keys[i],xy=(0,0),xycoords="data",
                       xytext=(-0.4,bar_center[j][1]),
                       ha='right',color=bar_colors[i],fontsize=16)
            bar_line[j][0] = one_bar.xy[0]+one_bar.get_width()
            bar_line[j][1] = one_bar.xy[1]
        else:
            bar_line[j][0] = one_bar.xy[0] - bar_line[0][0]
            bar_line[j][1] = one_bar.xy[1] - bar_line[0][1]
        # Display the percentage value (%)
        ax.annotate(f'{one_bar.get_height():.0%}',xy=(0,0),xycoords="data",
                   xytext=(bar_center[j][0],bar_center[j][1]),
                   ha="center",va="center",color="white",fontsize=16)
    # Draw the connecting line once, after both bars have been processed
    ax.arrow(bar_line[0][0],bar_line[0][1], bar_line[1][0], bar_line[1][1], head_width=0, head_length=0, ec='dimgray')

# 7. Set the y-axis range
ax.set_ylim(0,1)

tmp9.png
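The stacking in step 5 relies on the fact that each segment starts where the cumulative sum of the segments below it ends. The sketch below computes those start positions in one step; the variable names `shares`, `tops`, and `bottoms` are introduced here for illustration and are not part of the original code.

```python
import pandas as pd

music_env_df = pd.DataFrame(
    {"radio": [0.4, 0.21], "CD / Download": [0.22, 0.44],
     "Video distribution": [0.2, 0.3], "Music distribution": [0.18, 0.05]},
    index=["GLOBAL", "JAPAN"])

shares = music_env_df.T                    # rows: items, columns: GLOBAL / JAPAN
tops = shares.cumsum()                     # upper edge of each stacked segment
bottoms = tops.shift(1, fill_value=0.0)    # lower edge = upper edge of the segment below

print(bottoms)
```

With `bottoms` available, every segment, including the first, can be drawn with the same `ax.bar(x, shares.iloc[i], bottom=bottoms.iloc[i], ...)` call, so the `if i == 0` branch is no longer needed.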

Extra edition #1

DjHmYHNUcAE0dto.jpeg

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates


# Explicitly register the pandas converters with matplotlib (added to avoid an error)
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()

#Font settings
plt.rcParams['font.family'] = 'Hiragino Sans'  
plt.rcParams['font.weight'] = 'heavy'

#data
icing_method = pd.DataFrame([0.35, 0.19, 0.13,0.05,0.03,0.02],
                           index=['Ice bath 2 ℃', 'Ice bath 8 ℃', 'Watering+Ice massage','Watering','Fan','Vein icing'],
                           columns=['Cooling speed'])

icing_detail = ["Immerse the whole body in an ice bath at 2 ℃","Immerse the whole body in an ice bath at 8 ℃","Watering at 12 ℃ + ice massage",
                "Keep pouring tap water at 15 ℃ over the whole body","Expose to an electric fan at a room temperature of 22 ℃","(Neck / Axilla / Inguinal)"]

# Only the bar color is a custom value
ori_blue = "#71C0F9"

fig, ax = plt.subplots(figsize=(12, 6))

icing_method.plot.barh(legend=False, ax=ax, width=0.8,color=ori_blue)

# 1.Title setting
plt.title("Cooling method and cooling speed",fontsize=24,fontweight='bold',color="dimgray")

# 2. Leave a wide margin on the left
plt.subplots_adjust(left=0.35)

# 3. Reverse the order of the y-axis
ax.invert_yaxis()

# 4. Hide every spine except the left one
for side in ['right', 'top', 'bottom']:
    ax.spines[side].set_visible(False)

# 5. Hide the x- and y-axis ticks and the y-axis tick labels
ax.tick_params(bottom=False, left=False,labelleft=False)

# 6.x-axis value label setting
ax.set_xticks([i*0.1 for i in range(5)])
ax.tick_params(axis='x', labelcolor="silver")

# 7. x-axis range (the upper limit is 0.41 because the grid line at x=0.4 is not drawn when the limit is exactly 0.4)
ax.set_xlim(0,0.41)

# 8.x-axis grid settings
ax.grid(axis="x")

# 9.x-axis label setting
ax.set_xlabel("Body temperature drops per 10 seconds (℃)",fontsize="x-large",fontweight="bold",color="silver")

# 10. Show the value to the right of each bar, and the item name and supplementary description to its left
vmax = icing_method['Cooling speed'].max()
for i, (value,main_label,sub_label) in enumerate(zip(icing_method['Cooling speed'],icing_method.index,icing_detail)):
    ax.text(value+vmax*0.02, i, f'{value:,} ℃', fontsize='x-large', va='center', color=ori_blue)
    ax.text(-0.01, i-0.1,main_label  , fontsize='xx-large', va='center',ha='right',color="dimgray")
    ax.text(-0.01,i+0.25, sub_label, fontsize='x-large' ,va='center',ha='right', color="silver")

tmp12.png
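As an alternative to positioning the value labels by hand in step 10, matplotlib 3.4 and later provide `Axes.bar_label`. The following is a minimal sketch under that version assumption; it draws only the bars and the value labels, not the item names and descriptions on the left, and the `padding` value is an arbitrary choice.

```python
import matplotlib.pyplot as plt
import pandas as pd

icing_method = pd.DataFrame([0.35, 0.19, 0.13, 0.05, 0.03, 0.02],
                            index=['Ice bath 2 ℃', 'Ice bath 8 ℃', 'Watering+Ice massage',
                                   'Watering', 'Fan', 'Vein icing'],
                            columns=['Cooling speed'])
ori_blue = "#71C0F9"

fig, ax = plt.subplots(figsize=(12, 6))

# ax.barh returns a BarContainer, which is what bar_label expects
bars = ax.barh(icing_method.index, icing_method['Cooling speed'], height=0.8, color=ori_blue)
ax.invert_yaxis()

# Place one label at the end of each bar; padding is given in points
value_labels = ax.bar_label(bars,
                            labels=[f'{v:,} ℃' for v in icing_method['Cooling speed']],
                            padding=6, fontsize='x-large', color=ori_blue)
```

Because the padding is given in points rather than data units, the gap between bar and label stays the same when the x-axis range changes.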
