[PYTHON] [matplotlib] Graphing sample function for scientific papers

0. Introduction

This article presents general-purpose functions for creating graphs with Python's matplotlib, together with brief explanations. To keep them versatile, the functions use loops wherever possible so that they cope with an increasing number of legend entries. They also take a pandas DataFrame, which is common in data processing, so computed data can be passed in as it is.

The graph styles described are the ones the author has needed so far (more will be added over time). The article is aimed at scientists and engineers preparing research presentations and dissertations, and assumes a workflow in which results for many parameters are output and the graphs are viewed side by side.

1. Preparation

1.1. Environment construction

As of 04/04/2020, the latest Python 3 Anaconda environment is assumed. There are many guides online for installing Python, but Anaconda is recommended because the installation takes only a few clicks. Make sure Python is added to your PATH.

The editor is Visual Studio Code. I used to use PyCharm, but switched because I wanted the same editor for my other languages. The automatic formatting command (Shift + Alt + F) is convenient. Reference: Running python3 with VS code (windows10)

1.2. Importing packages

Import what you need. Besides matplotlib, I use numpy and pandas. From matplotlib, only pyplot is needed.

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

1.3. Preparation of sample data

We prepare sample data for use in this article. A DataFrame is easier to handle as data because everything can be tabulated at once, but a list allows finer control when actually drawing a graph, so columns may be converted back to lists one by one. Handling DataFrames and reading and writing files will be covered in a separate article.

Here I created the lists first and then generated `df` from them. Because CSV or Excel output is hard to read unless the data runs vertically, the rows and columns are transposed (`.T`).

A = np.linspace(0, 1)
B = [x**2 for x in A]
C = [np.cos(2*np.pi*x) for x in A]
S = [np.sin(2*np.pi*x) for x in A]
E = np.random.rand(len(A))/20+0.1
df = pd.DataFrame([A, B, C, S, E], index=[
                  "x", "squarex", "cosx", "sinx", "error"]).T

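As a possible alternative to building lists and transposing, the same table can be created from a dictionary of columns, and a column can be turned back into a list when finer control is needed. This is only a minimal sketch; the names `df_alt` and `x_values` are introduced here for illustration and do not appear elsewhere in the article.

```python
import numpy as np
import pandas as pd

# Same sample data as above, computed with vectorized NumPy calls.
A = np.linspace(0, 1)  # 50 points by default
df_alt = pd.DataFrame({
    "x": A,
    "squarex": A**2,
    "cosx": np.cos(2 * np.pi * A),
    "sinx": np.sin(2 * np.pi * A),
    "error": np.random.rand(len(A)) / 20 + 0.1,
})

# Convert one column back to a plain Python list when needed.
x_values = df_alt["x"].tolist()
```

With a dictionary, each key becomes a column directly, so no transpose is required.
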
1.4. Graph style adjustment

Adjust the settings so the graph looks suitable for a dissertation. The defaults can be changed through `rcParams`, as shown below. Changing the font, the tick direction, and the legend frame in particular goes a long way toward that look. For figures in English, Times New Roman is a safe choice of font.

The code below applies these settings and then plots the sample data prepared above as a quick check of the DataFrame. `df.plot()` draws every column as a separate legend entry, with the index as the x-axis. It is surprisingly easy, but fine adjustments are awkward, so I will not use it in the rest of this article.

plt.rcParams['font.family'] = 'Times New Roman'
plt.rcParams['font.size'] = 10  # set to the size you need
plt.rcParams['xtick.direction'] = 'in'  # 'in' or 'out'
plt.rcParams['ytick.direction'] = 'in'
plt.rcParams['axes.xmargin'] = 0.01
plt.rcParams['axes.ymargin'] = 0.01
plt.rcParams["legend.fancybox"] = False  # rounded corners off
plt.rcParams["legend.framealpha"] = 1  # frame opacity; 0 means no fill
plt.rcParams["legend.edgecolor"] = 'black'  # legend frame edge color

# Use "x" as the index only for this plot; df keeps its "x" column,
# which the functions below access as df[x].
df.set_index("x").plot()
plt.legend()
plt.show()

The output result is as follows. Figure_1.png
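If the style should apply to a single figure rather than the whole session, `plt.rc_context` can scope settings to a `with` block. The following is a minimal sketch under that assumption; the dictionary name `paper_style` is hypothetical and it holds only a subset of the settings above (the font family is left out so the sketch does not depend on installed fonts).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt

# Hypothetical name; a subset of the settings used in this article.
paper_style = {
    "font.size": 10,
    "xtick.direction": "in",
    "ytick.direction": "in",
    "legend.fancybox": False,
    "legend.framealpha": 1,
    "legend.edgecolor": "black",
}

before = plt.rcParams["xtick.direction"]
with plt.rc_context(paper_style):
    # The settings are active only inside this block.
    inside = plt.rcParams["xtick.direction"]
    fig, ax = plt.subplots()
    ax.plot([0, 1], [0, 1], label="sample")
    ax.legend()
after = plt.rcParams["xtick.direction"]  # restored to the previous value
plt.close(fig)
```

This keeps the global defaults untouched, which may be convenient when paper figures and quick-look plots are made in the same script.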

1.5. About LaTeX notation

In Python, mathematical expressions can be written in LaTeX notation as r'$ ~~ $' (the expression goes in place of ~~). In graphs, I find this is needed less for writing full formulas than for putting superscripts, subscripts, and Greek letters in labels. See the official documentation for the notation. The default math font is dejavusans, but since a sans-serif face looks out of place, I recommend changing it to stix as follows.

plt.rcParams["mathtext.fontset"] = "stix"  # use the STIX font

One caveat: as it stands, math text is drawn in a thin (or rather, regular) weight. The rest of matplotlib's default text looks bold by comparison, so the design can feel slightly inconsistent. If this bothers you, you can follow the approach linked here to make everything thinner, although it takes some effort.

Finally, if you want to write ordinary text inside a math expression, the default thin italic can be a nuisance. The following setting renders math text in the same style as ordinary text.

plt.rcParams['mathtext.default'] = 'default'  # render math text like ordinary text

With the settings so far, superscripts sit too high and subscripts too low, so I suggest correcting that. Reference: Solving the problem that superscripts are too high and subscripts are too low in matplotlib
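To illustrate the notation, here is a small sketch that puts a Greek letter, a superscript, and a subscript into axis labels. The label texts and units are made up for the example and are not taken from any real data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt

plt.rcParams["mathtext.fontset"] = "stix"

fig, ax = plt.subplots(figsize=(4, 3))
ax.plot([0, 1, 2], [0, 1, 4], color="k")
# Raw strings (r'...') keep the backslashes intact for mathtext.
ax.set_xlabel(r"$\theta$ (rad)")
ax.set_ylabel(r"$x^{2}$, $\sigma_{y}$ (a.u.)")
x_label = ax.get_xlabel()
plt.close(fig)
```

Only the parts between the dollar signs are treated as math; the rest of the label is ordinary text.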

1.6. Precautions

From here on, this article draws graphs with the object-oriented interface, that is, calls of the form `ax.~~`. For the theory, please refer to someone who explains it properly. Reference: Basic knowledge of matplotlib that I wanted to know early, or the story of an artist who can adjust the appearance
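For readers unfamiliar with the object-oriented interface, the sketch below shows the basic pattern used by all the functions that follow: create a Figure and an Axes, then call methods on the Axes. The data values are placeholders chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(5, 5))  # one Figure containing one Axes
line, = ax.plot([0, 1, 2], [0, 1, 4], color="k", label="sample")
ax.set_xlabel("x")   # counterpart of plt.xlabel("x")
ax.set_ylabel("y")   # counterpart of plt.ylabel("y")
ax.legend()          # counterpart of plt.legend()
n_lines = len(ax.get_lines())
plt.close(fig)
```

Because every call names the Axes it acts on, the same pattern extends naturally to figures with several axes.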

2. Multiple legends (1 axis)

2.1. Overview

2.2. Functions

multiLegend


def multiLegend(df, x, y_list):
    c_list = ["k", "r", "b", "g", "c", "m", "y"]
    l_list = ["-", "--", "-.", ":"]  # ":" is the dotted line style
    fig, ax = plt.subplots(figsize=(5, 5))
    plt.subplots_adjust(top=0.95, right=0.95)
    for i in range(len(y_list)):
        y = y_list[i]
        ax.plot(df[x], df[y], linestyle=l_list[i], color=c_list[i], label=y)
    yLabel = ', '.join(y_list)
    ax.set_ylabel(yLabel)
    ax.set_xlabel(x)
    plt.legend()
    plt.show()
    return

2.3. Execution result

multiLegend(df, "x", ["squarex", "cosx", "sinx"])

Figure_3.png
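The function indexes its color and line-style lists directly, so it raises an IndexError once there are more series than list entries. One way around this, sketched here as an assumption rather than as part of the original function, is to wrap around both lists. The helper name `pick_styles` is hypothetical, and the lists mirror those in the function, with ":" as matplotlib's dotted line style.

```python
from itertools import cycle, islice

c_list = ["k", "r", "b", "g", "c", "m", "y"]
l_list = ["-", "--", "-.", ":"]

def pick_styles(n):
    """Return n (color, linestyle) pairs, wrapping around each list."""
    return list(islice(zip(cycle(c_list), cycle(l_list)), n))

styles = pick_styles(9)
```

Inside the plotting loop, `color, linestyle = styles[i]` could then replace `c_list[i]` and `l_list[i]`. Since the lists have 7 and 4 entries, the combinations repeat only after 28 series.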

3. Multiple legends (2 axes)

3.1. Overview

3.2. Functions

multiLegend2


def multiLegend2(df, x, y1_list, y2_list=None):
    # If y2_list is omitted, no second axis is created
    c_list = ["k", "r", "b", "g", "c", "m", "y"]
    l_list = ["-", "--", "-.", ":"]  # ":" is the dotted line style
    fig, ax1 = plt.subplots(figsize=(5.5, 5))
    j = 0
    for y in y1_list:
        ax1.plot(df[x], df[y], linestyle=l_list[j],
                      color=c_list[j], label=y)
        j += 1
    ax1.legend(loc='lower left')
    ax1.set_xlabel(x)
    ax1.set_ylabel(', '.join(y1_list))
    if y2_list is not None:
        ax2 = ax1.twinx()
        for y in y2_list:
            ax2.plot(df[x], df[y], linestyle=l_list[j],
                         color=c_list[j], label=y)
            j += 1
        ax2.legend(loc='upper right')
        ax2.set_ylabel(', '.join(y2_list))
    plt.tight_layout()
    plt.show()
    return

3.3. Execution result

multiLegend2(df, "x", ["squarex", "cosx"], ["sinx"])

Figure_1.png
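In the function above, each axis draws its own legend. If a single combined legend is preferred, the handles and labels of both axes can be collected with `get_legend_handles_labels` and passed to one `legend` call. This is a minimal sketch with inline sample data, not a change to the function itself.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 1)
fig, ax1 = plt.subplots(figsize=(5.5, 5))
ax1.plot(x, x**2, color="k", linestyle="-", label="squarex")
ax2 = ax1.twinx()
ax2.plot(x, np.sin(2 * np.pi * x), color="r", linestyle="--", label="sinx")

# Collect handles and labels from both axes and draw one legend.
h1, l1 = ax1.get_legend_handles_labels()
h2, l2 = ax2.get_legend_handles_labels()
legend = ax2.legend(h1 + h2, l1 + l2, loc="upper right")
legend_labels = [t.get_text() for t in legend.get_texts()]
plt.close(fig)
```

Drawing the legend on the second axis keeps it on top of the lines of both axes.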

4. Multi-axis graph

4.1. Overview

4.2. Functions

multiAxes


def multiAxes(df, x, y_list):
    c_list = ["k", "r", "b", "g", "c", "m", "y"]
    l_list = ["-", "--", "-.", ":"]  # ":" is the dotted line style
    fig, ax0 = plt.subplots(figsize=(6, 5))
    plt.subplots_adjust(top=0.95, right=0.95-(len(y_list)-1)*0.1)  # adjust here if the layout is off
    axes = [ax0]  # one Axes per variable
    p_list = []  # container for the plotted lines
    for i in range(len(y_list)):
        y = y_list[i]
        if i != 0:
            axes.append(ax0.twinx())
            axes[i].spines["right"].set_position(("axes", 1+(i-1)*0.2))  # adjust here if the position is off
        p, = axes[i].plot(df[x], df[y], linestyle=l_list[i], color=c_list[i], label=y)
        p_list.append(p)
        axes[i].set_ylabel(y_list[i], color=c_list[i])
        axes[i].yaxis.label.set_color(c_list[i])
        axes[i].spines['right'].set_color(c_list[i])
        axes[i].tick_params(axis='y', colors=c_list[i])
    axes[0].set_xlabel(x)
    plt.legend(p_list,y_list)
    plt.show()
    return

4.3. Execution result

multiAxes(df, "x", ["squarex", "cosx", "sinx"])

Figure_2.png
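The two expressions marked for adjustment in the function determine how much room is reserved on the right and where each additional y-axis spine sits, in axes coordinates. The sketch below only reproduces that arithmetic so the values can be inspected; the helper name `axis_layout` and its parameters `step` and `margin_per_axis` are hypothetical.

```python
def axis_layout(n_series, step=0.2, margin_per_axis=0.1):
    """Return the right margin and the spine positions of the extra axes."""
    # Same formula as the subplots_adjust call in the function.
    right = 0.95 - (n_series - 1) * margin_per_axis
    # Same formula as the set_position call, for i = 1 .. n_series - 1.
    positions = [1 + (i - 1) * step for i in range(1, n_series)]
    return right, positions

right, positions = axis_layout(3)
```

For three series, the plot area ends at about 0.75 of the figure width, the first extra spine sits on the right edge of the axes (1.0), and the second sits 20% of the axes width further out (1.2).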

5. Multiple subplots

5.1. Overview

5.2. Functions

multiPlots


def multiPlots(df, x, y_list):
    c_list = ["k", "r", "b", "g", "c", "m", "y"]
    l_list = ["-", "--", "-.", ":"]  # ":" is the dotted line style
    fig, axes = plt.subplots(len(y_list), 1, sharex="all", figsize=(4, 2*len(y_list)))
    for i in range(len(y_list)):
        y = y_list[i]
        axes[i].plot(df[x], df[y], linestyle=l_list[i], color=c_list[i], label=y)
        axes[i].set_ylabel(y_list[i], color=c_list[i])
        axes[i].yaxis.label.set_color(c_list[i])
        axes[i].spines['left'].set_color(c_list[i])
        axes[i].tick_params(axis='y', colors=c_list[i])
        if i == len(y_list)-1:
            axes[i].set_xlabel(x)
    plt.tight_layout()
    plt.show()
    return

5.3. Execution result

multiPlots(df, "x", ["squarex", "cosx", "sinx"])

Figure_4.png
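One detail worth knowing: when `y_list` has a single entry, `plt.subplots` returns one Axes object instead of an array, so `axes[i]` in the function would fail. Passing `squeeze=False` always returns a 2-D array, which can then be flattened to one column. The following sketch shows this for the single-series case; it is a suggestion, not part of the original function.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt

y_list = ["squarex"]  # a single series
fig, axes = plt.subplots(len(y_list), 1, sharex="all", squeeze=False,
                         figsize=(4, 2 * len(y_list)))
axes = axes[:, 0]  # 2-D array of shape (n, 1) -> 1-D array of length n
n_axes = len(axes)
plt.close(fig)
```

After this step, `axes[i]` works for any number of series, including one.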

6. Multiple samples

6.1. Overview

6.2. Functions

multiData


def multiData(df_list, x, y1, y2=None):
    l_list = ["-", "--", "-.", ":"]  # ":" is the dotted line style
    fig, ax1 = plt.subplots(figsize=(5.25, 5))
    p_list = []  # container for the plotted lines
    labels = []
    for i in range(len(df_list)):
        df = df_list[i]
        p, = ax1.plot(df[x], df[y1], linestyle=l_list[i], color="k")
        labels.append(y1+'-data'+str(i+1))
        p_list.append(p)
    ax1.set_xlabel(x)
    ax1.set_ylabel(y1)
    if y2 is not None:
        ax2 = ax1.twinx()
        for i in range(len(df_list)):
            df = df_list[i]
            p, = ax2.plot(df[x], df[y2], linestyle=l_list[i], color="b")
            labels.append(y2+'-data'+str(i+1))
            p_list.append(p)
        ax2.set_ylabel(y2, color = "b")
        ax2.yaxis.label.set_color("b")
        ax2.spines['right'].set_color("b")
        ax2.tick_params(axis='y', colors="b")
    plt.legend(p_list, labels)
    plt.tight_layout()
    plt.show()
    return

6.3. Execution result

A_2 = np.linspace(1, 2)
B_2 = [x**2 for x in A_2]
C_2 = [np.cos(2*np.pi*x) for x in A_2]
df_2 = pd.DataFrame([A_2, B_2, C_2], index=["x", "squarex", "cosx"]).T
A_3 = np.linspace(2, 3)
B_3 = [x**2 for x in A_3]
C_3 = [np.cos(2*np.pi*x) for x in A_3]
df_3 = pd.DataFrame([A_3, B_3, C_3], index=["x", "squarex", "cosx"]).T

multiData([df, df_2, df_3], "x", "squarex", "cosx")

Figure_2.png
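The sample tables above are built by repeating the same three lines for each range. In keeping with the loop-oriented approach of this article, they could also be generated by a small helper. This is a sketch only; the function name `make_sample` is hypothetical.

```python
import numpy as np
import pandas as pd

def make_sample(start, stop):
    """Hypothetical helper: build one sample table over [start, stop]."""
    x = np.linspace(start, stop)
    return pd.DataFrame({"x": x, "squarex": x**2, "cosx": np.cos(2 * np.pi * x)})

# Three tables covering [0, 1], [1, 2], and [2, 3].
df_list = [make_sample(k, k + 1) for k in range(3)]
```

The resulting list can be passed to the function directly, for example `multiData(df_list, "x", "squarex", "cosx")`.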

7. 1 axis with error

7.1. Overview

7.2. Functions

multiLegend_wError


def multiLegend_wError(df, x, y_list, y_error_list):
    c_list = ["k", "r", "b", "g", "c", "m", "y"]
    l_list = ["-", "--", "-.", ":"]  # ":" is the dotted line style
    fig, ax = plt.subplots(figsize=(5, 5))
    plt.subplots_adjust(top=0.95, right=0.95)
    for i in range(len(y_list)):
        y = y_list[i]
        y_error = y_error_list[i]
        ax.plot(df[x], df[y], linestyle=l_list[i], color=c_list[i], label=y)
        ax.fill_between(df[x], df[y]+df[y_error], df[y]-df[y_error], facecolor=c_list[i], edgecolor=None, alpha=0.3)
    yLabel = ', '.join(y_list)
    ax.set_ylabel(yLabel)
    ax.set_xlabel(x)
    plt.legend()
    plt.show()
    return

7.3. Execution result

multiLegend_wError(df, "x", ["squarex", "cosx", "sinx"], ["error", "error", "error"])

Figure_1.png
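The function above draws the error as a shaded band with `fill_between`. When the data points are sparse, discrete error bars drawn with `ax.errorbar` are a common alternative. The sketch below assumes a constant error of 0.1, which is a placeholder and not real measurement data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 1, 11)
y = x**2
err = np.full_like(x, 0.1)  # placeholder error values

fig, ax = plt.subplots(figsize=(5, 5))
# fmt="-" draws a solid line through the points; capsize adds end caps.
container = ax.errorbar(x, y, yerr=err, fmt="-", color="k",
                        ecolor="k", capsize=3, label="squarex")
ax.legend()
has_yerr = container.has_yerr
plt.close(fig)
```

A shaded band tends to read better for dense curves, while error bars make individual points easier to compare.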

8. Conclusion

In this article, I have summarized the matplotlib graph output functions that I actually used for science and technology papers. When making a version for submission, I write each axis label by hand, or prepare a list of labels and read it in the for loop. Here, I reused the DataFrame column names as labels to keep the code easy to read.

I will update this article regularly.
