[PYTHON] 4 Techniques for Creating Diagrams for Papers and Presentation Materials with matplotlib

Notice of publication of "Introduction to Jupyter [Practice] for Python Users" on September 9, 2017 (added on 2017/08/29)

I co-authored the book "Introduction to Jupyter [Practice] for Python Users".

Although it is not in the title, the book also explains in detail how to use pandas, Matplotlib, and Bokeh. The content of this article is covered there as well, in an easier-to-understand form. I wrote it aiming for a must-have book for anyone using Jupyter, pandas, Matplotlib, and Bokeh, so please take a look.

"Introduction to Jupyter [Practice] for Python Users" (Takahiro Ikeuchi, Kaoruko Katayanagi, Emma Iwao, @driller, Gijutsu-Hyoronsha)


This article is the entry for day 13 of the Python Advent Calendar 2015 (Adventer).

Introduction

Since I started doing data analysis in Python, I have wanted to make the diagrams for my papers and presentation materials in Python as well. There are various drawing tools, but I concluded that matplotlib is a good fit for the simple graphs used in printed matter such as papers, and I have tried all sorts of drawing with it. Today I would like to introduce four techniques that I learned through trial and error: 1) drawing a stacked bar chart, 2) specifying colors in detail, 3) placing the legend outside the frame, and 4) fitting the figure on the canvas.

Environment

Windows7 + Anaconda(Python3.5)

Create demo dataset

This time, I made the demo with normally distributed values. I used NumPy's random.normal, which lets you specify the mean, the standard deviation, and the number of values to generate.

import numpy as np  #For generating the demo data; not part of the main subject
import matplotlib.pyplot as plt  #2D plotting library at the heart of today's topic
import matplotlib.cm as cm  #Module for specifying the colors used in the graph in detail
from IPython.display import Image  #Class for displaying image files in the notebook

plt.rcParams['font.size'] = 14  #Set font size

x = np.array(range(1, 25))
y1 = np.random.normal(20, 5, 24)
y2 = np.random.normal(30, 5, 24)
y3 = np.random.normal(40, 5, 24)
y4 = np.random.normal(50, 5, 24)
y5 = np.random.normal(60, 5, 24)
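
If you want the demo figures to come out the same on every run, the random values can be drawn from a seeded generator. The following is a minimal sketch, assuming NumPy's Generator API (np.random.default_rng); the helper name make_demo_series and the seed value 0 are arbitrary choices for illustration and are not part of the original notebook.

```python
import numpy as np

def make_demo_series(means, std=5, n=24, seed=0):
    """Return one normally distributed series of length n per mean.

    make_demo_series and seed=0 are hypothetical choices for this sketch.
    """
    rng = np.random.default_rng(seed)  # seeded generator gives repeatable values
    return [rng.normal(loc=mean, scale=std, size=n) for mean in means]

x = np.arange(1, 25)  # same values as np.array(range(1, 25))
y1, y2, y3, y4, y5 = make_demo_series([20, 30, 40, 50, 60])
```

Calling make_demo_series again with the same seed returns the same numbers, so a figure can be regenerated exactly.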

Technique 1 What to do when drawing a stacked bar chart

If you simply draw one series after another on the same figure with matplotlib, every bar starts from zero, so a bar drawn later covers the bar drawn earlier. Therefore, to stack n values (V1, V2, V3, ..., Vn), you need to create the cumulative values V1 + V2 + ... + Vn, V2 + V3 + ... + Vn, ..., Vn and draw them in order from the largest. There are many cooler ways to write this, but written simply, the drawing dataset is created as follows.

#Create a dataset for a bar chart
dataset = {'dat1':(y1+y2+y3+y4+y5), 
           'dat2':(y2+y3+y4+y5), 
           'dat3':(y3+y4+y5), 
           'dat4':(y4+y5), 
           'dat5':y5}
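
The same dictionary can be built without typing every sum by hand. The following is a minimal sketch, assuming the five series are stacked into one 2-D array; the name stacked_totals is made up for this example, and the seeded random data only stands in for y1 to y5.

```python
import numpy as np

rng = np.random.default_rng(0)  # stand-in data for y1..y5 (seed is arbitrary)
series = [rng.normal(mean, 5, 24) for mean in (20, 30, 40, 50, 60)]

def stacked_totals(series):
    """Return {'dat1': y1+...+yn, 'dat2': y2+...+yn, ..., 'datn': yn}."""
    values = np.vstack(series)  # shape (n_series, n_points)
    # Reverse, accumulate, reverse again: row i becomes the sum of rows i..n-1.
    totals = np.cumsum(values[::-1], axis=0)[::-1]
    return {f'dat{i + 1}': row for i, row in enumerate(totals)}

dataset = stacked_totals(series)
```

The keys match the hand-written version, so the drawing code can use the result unchanged.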

Technique 2 Specifying a fine color set

If you use only the basic red (r), green (g), and blue (b), the graph looks a little sloppy, so use matplotlib's cm module to make it look a little more stylish. See the matplotlib colormap reference for more color information.

This time, I used the colors extracted from several color scales.

#Color set creation
colors = [cm.RdBu(0.85), cm.RdBu(0.7), cm.PiYG(0.7), cm.Spectral(0.38), cm.Spectral(0.25)]
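
Calling a colormap with a number between 0 and 1 returns the color at that position as an (R, G, B, A) tuple, which is what each entry of colors holds. Here is a small sketch, assuming the maps are looked up by name with plt.get_cmap rather than through the cm attributes; sample_color is a hypothetical helper.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt

def sample_color(cmap_name, position):
    """Return the RGBA tuple found at `position` (0..1) on the named colormap."""
    return plt.get_cmap(cmap_name)(position)

picks = [('RdBu', 0.85), ('RdBu', 0.7), ('PiYG', 0.7),
         ('Spectral', 0.38), ('Spectral', 0.25)]
colors = [sample_color(name, pos) for name, pos in picks]

# Print the sampled colors so they can be checked by eye.
for (name, pos), rgba in zip(picks, colors):
    print(name, pos, tuple(round(float(c), 2) for c in rgba))
```

Moving the position toward 0 or 1 picks a stronger color from either end of the scale, and values near 0.5 give the pale middle of a diverging map.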

Graph drawing

Now it's time to start drawing. For this demo, I wrote a function so that the drawing can be run repeatedly under different conditions.

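#Creating a drawing function
#  bbax, bbay: legend anchor position (passed to bbox_to_anchor)
#  bap: borderaxespad value (999 means "leave it at the default")
#  adj: apply subplots_adjust when non-zero
#  adjl, adjr: left and right positions passed to subplots_adjust
def plot(bbax, bbay, bap, adj, adjl, adjr):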
    
    fig, ax1 = plt.subplots(1, 1, figsize=(12, 5))

    ax1.bar(x, dataset['dat1'], color=colors[0], edgecolor='w', align='center', label='Data1')
    ax1.bar(x, dataset['dat2'], color=colors[1], edgecolor='w', align='center', label='Data2')
    ax1.bar(x, dataset['dat3'], color=colors[2], edgecolor='w', align='center', label='Data3')
    ax1.bar(x, dataset['dat4'], color=colors[3], edgecolor='w', align='center', label='Data4')
    ax1.bar(x, dataset['dat5'], color=colors[4], edgecolor='w', align='center', label='Data5')
    
    #Legend drawing(Position designation)
    if bap == 999:
        ax1.legend(bbox_to_anchor=(bbax, bbay))
    else:
        ax1.legend(bbox_to_anchor=(bbax, bbay), borderaxespad=bap)

    #Size adjustment
    if adj != 0:
        plt.subplots_adjust(left = adjl, right = adjr)

    #File output
    fname = 'fig' + str(bbax) + str(bbay) + str(bap) + str(adj) + str(adjl) + str(adjr) + '.png'
    plt.savefig(fname, dpi = 300, format='png')
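
As an alternative to pre-summing the series, the bar call itself accepts a bottom argument that says where each bar starts. The sketch below assumes that approach instead of the overdrawing used in the function above; the seeded data, the default colors, and the name draw_stacked are illustrative only.

```python
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # stand-in data (seed is arbitrary)
x = np.arange(1, 25)
series = [rng.normal(mean, 5, 24) for mean in (20, 30, 40, 50, 60)]

def draw_stacked(ax, x, series, labels):
    """Stack the series from the bottom up using the bottom argument."""
    bottom = np.zeros(len(x))
    containers = []
    # Draw the last series first so the first one ends up on top,
    # which matches the layer order produced by the overdrawing approach.
    for values, label in zip(series[::-1], labels[::-1]):
        containers.append(ax.bar(x, values, bottom=bottom, edgecolor='w',
                                 align='center', label=label))
        bottom = bottom + values  # the next layer starts where this one ended
    return containers, bottom

fig, ax = plt.subplots(figsize=(12, 5))
labels = [f'Data{i}' for i in range(1, 6)]
containers, top = draw_stacked(ax, x, series, labels)
ax.legend()
```

With this approach each bar keeps its own height, so the raw series can be passed in directly. The legend entries appear in drawing order, so Data5 is listed first here.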

Technique 3 Legend positioning

The legend can also be placed at a predefined location, but here we take the approach of positioning it by specifying the anchor position. First, if you do not specify a position, the legend appears in the upper right corner, as shown in the figure below.

Fig. 1 Default legend layout

So the anchor of the legend is specified with bbox_to_anchor. It is this part of the code:

    ax1.legend(bbox_to_anchor=(bbax, bbay))

(In this code, passing 999 as bap means that borderaxespad, described later, is not specified.) If you set the anchor position to (0, 0), (0, 1), (1, 0), and (1, 1) and draw, the legend is arranged in the following positional relationship.

Fig. 2 Schematic diagram of legend layout

In other words, the anchor specifies where the upper right corner of the legend sits relative to the border of the figure. Note that there is a subtle gap between the border and the legend; this gap comes from the default settings. Use the borderaxespad option to adjust it.

It is this part of the code:

borderaxespad=bap

Setting bap = 0 eliminates the subtle gap seen in Fig. 2. The schematic diagram shows the following positional relationship.

Fig. 3 Arrangement of legend when borderaxespad = 0 is set

Since I specified bap = 0 this time, the corner of the border and the corner of the legend coincide exactly, but you can adjust the distance from the border by changing the value. Increasing the value moves the legend away from the anchor position by the specified amount.
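
The effect of borderaxespad can be checked numerically by measuring how far the legend's right edge sits from the anchor. This is a rough sketch under a few assumptions: loc='upper right' is passed explicitly because recent matplotlib versions default to 'best', the Agg backend is used so that the legend can be measured in pixels, and legend_gap_px is a hypothetical helper name.

```python
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt

def legend_gap_px(borderaxespad=None):
    """Return the horizontal gap (pixels) between the anchor (1, 1) and the legend's right edge."""
    fig, ax = plt.subplots(figsize=(6, 4), dpi=100)
    ax.bar([1, 2, 3], [3, 1, 2], label='Data1')
    kwargs = {} if borderaxespad is None else {'borderaxespad': borderaxespad}
    # loc is stated explicitly so the anchor refers to the legend's upper right corner.
    leg = ax.legend(loc='upper right', bbox_to_anchor=(1, 1), **kwargs)
    fig.canvas.draw()
    renderer = fig.canvas.get_renderer()
    anchor_x = ax.transAxes.transform((1, 1))[0]  # anchor in pixel coordinates
    gap = anchor_x - leg.get_window_extent(renderer).x1
    plt.close(fig)
    return gap

default_gap = legend_gap_px()             # borderaxespad left at its default
zero_gap = legend_gap_px(borderaxespad=0)
print(default_gap, zero_gap)
```

The first number should be a few pixels and the second close to zero, which corresponds to the difference between Fig. 2 and Fig. 3.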

Now that you have grasped the positional relationship between the legend and the border of the figure, the figure below shows the coordinates used for positioning.

Fig. 4 Position information of the graph

That is, the left border is at x = 0, the right border is at x = 1, the bottom is at y = 0, and the top is at y = 1. So, to draw the legend outside the border on the right, make the value of x larger than 1.

Now it's time to draw the legend on the outside. This time, I set bbox_to_anchor = (1.2, 1) to move it outside the border on the right.

Now the legend is drawn outside.

Fig. 5 The figure with the legend placed on the outside
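
To confirm where the legend actually lands, its box can be converted back into the coordinates of Fig. 4. This is a minimal sketch with toy data, again assuming loc='upper right' is given explicitly and borderaxespad is 0; the variable names are illustrative.

```python
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(12, 5))
ax.bar([1, 2, 3], [3, 1, 2], label='Data1')
# loc='upper right' is an explicit assumption matching the article's description;
# borderaxespad=0 pins the legend corner exactly on the anchor.
leg = ax.legend(loc='upper right', bbox_to_anchor=(1.2, 1), borderaxespad=0)

fig.canvas.draw()
box = leg.get_window_extent(fig.canvas.get_renderer())
# Convert the legend box from pixels to axes coordinates (0..1 inside the frame).
(left, bottom), (right, top) = ax.transAxes.inverted().transform(
    [[box.x0, box.y0], [box.x1, box.y1]])
print(round(float(right), 2), round(float(top), 2))
```

The printed corner should be close to the anchor (1.2, 1), and a left value above 1 means the whole legend sits outside the border.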

Technique 4 Placement on canvas

Now we understand the positional relationship and can place the legend exactly where we want, so it is tempting to stop here, but of course this is not the end. If you output the graph to a file, the legend is cut off partway, as you can see in Fig. 5.

This is because the graph plus the legend is too large for the canvas on which they are drawn. To put it simply, the edge of the canvas is at the blue line in the figure below.

Fig. 6 Figure with legend placed on the outside (with border)

Use subplots_adjust to fix this. It is the following part of the code.

 plt.subplots_adjust(left = adjl, right = adjr)

This time, subplots_adjust is applied when adj is non-zero.

subplots_adjust is also set using coordinates like those in Fig. 4, this time measured on the canvas. In other words, you specify where on the canvas to place the left and right borders of the figure. The canvas has 0 at its left edge and 1 at its right edge. With subplots_adjust(left = 0, right = 1), the left border is at position 0 on the canvas and the right border is at position 1. The left border is always to the left of the right border, so the value of left cannot be greater than the value of right.

So, set the previous figure so that it fits on the canvas. This time I set subplots_adjust(left = 0.1, right = 0.8). The output is as follows.

Fig. 7 Figure with subplots_adjust (with border)

It fits nicely.
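
Why these values work can be checked with a little arithmetic. With the legend corner anchored at x = 1.2 and a zero pad, its right edge should land at left + 1.2 * (right - left) on the canvas: 0.1 + 1.2 * 0.7 = 0.94 for the settings above, against 0.125 + 1.2 * 0.775 = 1.055 with matplotlib's default margins of 0.125 and 0.9. The sketch below measures the same thing on a toy figure; legend_right_edge is a hypothetical helper, and loc='upper right' with borderaxespad=0 is an assumption of this example.

```python
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt

def legend_right_edge(left=None, right=None):
    """Return the legend's right edge as a fraction of the canvas width."""
    fig, ax = plt.subplots(figsize=(12, 5))
    ax.bar([1, 2, 3], [3, 1, 2], label='Data1')
    leg = ax.legend(loc='upper right', bbox_to_anchor=(1.2, 1), borderaxespad=0)
    if left is not None and right is not None:
        fig.subplots_adjust(left=left, right=right)
    fig.canvas.draw()
    box = leg.get_window_extent(fig.canvas.get_renderer())
    edge = box.x1 / fig.bbox.width  # pixels -> fraction of the canvas
    plt.close(fig)
    return edge

before = legend_right_edge()                    # default margins
after = legend_right_edge(left=0.1, right=0.8)  # the settings used in this article
print(before, after)
```

A value above 1 means the legend runs past the right edge of the canvas and is clipped in the saved file; a value below 1 means it fits.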

Conclusion

So, what did you think of these four techniques for making diagrams for papers and presentation materials with matplotlib? The Jupyter notebook I made for this post will be uploaded to GitHub later, so if you are interested, please take a look there as well.

2015/12/13 22:18 Jupyter notebook has been uploaded to GitHub. https://github.com/nobolis/pydata_workbook/blob/master/matplotlib_tips/matplotlib_tips.ipynb
