[PYTHON] Separation of design and data in matplotlib

Overview

Issues

I often draw graphs from **Python** using **matplotlib**. The data is generated and formatted by a **Python** application, and the graph is output with **matplotlib**.

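While doing this, I kept running into the following three issues.

- The graph design is not unified
- Similar code ends up written in multiple Python files
- It is not obvious which code produces which part of the design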

Cause

I concluded that the cause was that the code that **visualizes the data** and the code that **manages the design** were written in the same application.

Solution

I tried separating the code that **manages the design** out into a **configuration file**.

Details

Of the three issues listed, "the graph design is not unified" means that the **label size**, **legend**, **plotted points**, and so on assigned to the vertical and horizontal axes are inconsistent from graph to graph.

This is also related to the second issue, because I had written **similar code** in other Python files.

That code was not **unified** either: each time, I looked up a past Python file and **copied and pasted** from it on the spot.

I think the reason I kept **copying** is that the **design** I wanted and the **code** that produces it were not tied together, or at least the link was not intuitive. This leads to the third issue.

I concluded that the root cause of all three issues was that the **data** and the **design** were not **separated**.

If so, the fix is simple: move the **design** out into a **configuration file**.

Express your design in words in a configuration file

I mapped each design element to a word, as shown in the list below.

Each entry is laid out as follows.

- Design classification
  - Configuration file parameter
    - Meaning of the parameter
    - Corresponding matplotlib code

The entries are:

- **Size** (sets the sizes)
  - figure_x
    - Horizontal size of the figure
    - pyplot.figure(figsize=(figure_x, figure_y))
  - figure_y
    - Vertical size of the figure
    - Same as figure_x
  - font_title
    - Title font size
    - pyplot.title(fontsize=font_title)
  - font_x_label
    - X-axis label font size
    - pyplot.figure().add_subplot().set_xlabel(fontsize=font_x_label)
  - font_y_label
    - Y-axis label font size
    - pyplot.figure().add_subplot().set_ylabel(fontsize=font_y_label)
  - font_tick
    - Axis tick label font size
    - pyplot.tick_params(labelsize=font_tick)
  - font_legend
    - Legend font size
    - pyplot.legend(fontsize=font_legend)
  - marker
    - Marker size when plotting data
    - pyplot.figure().add_subplot().plot(markersize=marker)
- **Position**
  - subplot
    - Where to place the graph
    - pyplot.figure().add_subplot(subplot)
  - legend_location
    - Location in the graph to place the legend
    - pyplot.legend(loc=legend_location)

How to express a specific configuration file

For the configuration file, I considered several formats, such as JSON and YAML.

In the end, I decided to use **configparser**, which is part of the Python standard library.

JSON and YAML would have been fine too, but I wanted something that can be **used intuitively** rather than expressed hierarchically.

With **configparser**, the configuration file looks like this.

config.ini


[Size]
figure_x=8
figure_y=8
font_title=20
font_x_label=18
font_y_label=18
font_tick=10
font_legend=15
marker=10

[Position]
subplot=111
legend_location=upper right

[Markers]
0=D
1=>
2=.
3=+
4=|

[Color]
0=red
1=blue
2=green
3=black
4=yellow

Items that correspond to a **design classification** are enclosed in **square brackets**, such as **[Size]**. This is called a **section**. Below it, write one **parameter=value** line per setting. The parameter name is called the **key**.
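To make the section and key terminology concrete, here is a minimal sketch that parses a shortened copy of the file above and prints what configparser sees. The text is embedded as a string (`CONFIG_TEXT` is just a name chosen for this illustration) so the snippet runs without any file on disk.

```python
import configparser

# A shortened copy of config.ini, embedded so the sketch is self-contained.
CONFIG_TEXT = """
[Size]
figure_x=8
figure_y=8

[Position]
subplot=111
legend_location=upper right

[Markers]
0=D
1=>
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)

# Each [bracketed] header becomes a section; each key=value line becomes a key in it.
for section in parser.sections():
    print(section, dict(parser[section]))
```

Note that values containing spaces, such as `upper right`, are kept as a single string.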

In addition to **Size** and **Position**, the configuration file above also contains **Markers** (the types of markers to plot) and **Color** (the colors of the markers and lines to plot).
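The **Markers** and **Color** sections are keyed by series index, so a file with five entries can style at most five series. As a rough sketch of one way to handle more, the hypothetical helper `style_for` below wraps around with a modulo. The wrap-around is my own assumption, not something the original code does, and it assumes the keys run consecutively from 0.

```python
import configparser

parser = configparser.ConfigParser()
parser.read_string("""
[Markers]
0=D
1=>
2=.

[Color]
0=red
1=blue
2=green
""")


def style_for(index, rules):
    """Hypothetical helper: return (color, marker) for the series at `index`."""
    colors = rules["Color"]
    markers = rules["Markers"]
    # Wrap around so index 3 reuses entry 0 instead of raising KeyError.
    # Assumes the keys are consecutive integers starting at 0.
    color = colors[str(index % len(colors))]
    marker = markers[str(index % len(markers))]
    return color, marker


first = style_for(0, parser)
wrapped = style_for(4, parser)
print(first, wrapped)
```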

Use the parameters described in the configuration file from the Python code

To access the parameters in the configuration file, write the code as follows:


import configparser


rule_file = configparser.ConfigParser()
rule_file.read("config file path", "UTF-8")

hogehoge = rule_file["Section name"]["Key name"]

Note that every value is read as a **string**, so numeric settings have to be converted before use.
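Besides wrapping each value in `int()`, configparser offers typed getters. The sketch below shows the difference on a two-line config. The key `line_width` is hypothetical and deliberately absent from the config, to show how `fallback` behaves.

```python
import configparser

parser = configparser.ConfigParser()
parser.read_string("""
[Size]
figure_x=8
font_title=20
""")

# Plain indexing always returns a str.
raw = parser["Size"]["figure_x"]

# getint() performs the conversion for you.
as_int = parser.getint("Size", "figure_x")

# fallback supplies a value when the key is absent ("line_width" is a hypothetical key).
line_width = parser.getfloat("Size", "line_width", fallback=1.5)

print(repr(raw), as_int, line_width)
```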

Actual usage example

The code below creates a line chart from the **data passed in** and the **design configuration file**.

make_line_graph.py


"""Line graph creation function

Draw a line graph using the passed data and save it as image data.

"""


import configparser
import matplotlib.pyplot as plt


def make_line_graph(data, config="config.ini"):
    """Line graph drawing

    Create a line graph using the passed data.
    The design is read from a separate config file.

    Args:
        data(dict): Contains the data to plot
        config(str): The name of the config file

    Returns:
        None: The graph is saved as an image file

    Note:
        The keys and values that the argument data should contain are described below.

        key         : value
        ------------------------
        title(str): Graph title
        label(list): Legend labels
        x_data(list): x-axis data
        y_data(list): y-axis data
        x_ticks(list): Tick positions and tick labels for the x-axis (optional)
        y_ticks(list): Tick positions and tick labels for the y-axis (optional)
        x_label(str): x-axis name
        y_label(str): y-axis name
        save_dir(str): Path of the directory to save to
        save_name(str): Save file name
        file_type(str): Save file format
    """

    rule_file = configparser.ConfigParser()
    rule_file.read("./conf/{0}".format(config), "UTF-8")

    fig = plt.figure(figsize=(int(rule_file["Size"]["figure_x"]), int(rule_file["Size"]["figure_y"])))

    ax = fig.add_subplot(int(rule_file["Position"]["subplot"]))
    ax.set_xlabel(data["x_label"], fontsize=int(rule_file["Size"]["font_x_label"]))
    ax.set_ylabel(data["y_label"], fontsize=int(rule_file["Size"]["font_y_label"]))

    for index in range(len(data["x_data"])):
        ax.plot(data["x_data"][index],
                data["y_data"][index],
                label=data["label"][index],
                color=rule_file["Color"][str(index)],
                marker=rule_file["Markers"][str(index)],
                markersize=int(rule_file["Size"]["marker"]))

    plt.title(data["title"], fontsize=int(rule_file["Size"]["font_title"]))

    if "x_ticks" in data.keys():
        plt.xticks(data["x_ticks"][0], data["x_ticks"][1])

    if "y_ticks" in data.keys():
        plt.yticks(data["y_ticks"][0], data["y_ticks"][1])

    plt.tick_params(labelsize=int(rule_file["Size"]["font_tick"]))
    plt.legend(fontsize=int(rule_file["Size"]["font_legend"]), loc=rule_file["Position"]["legend_location"])
    plt.savefig("".join([data["save_dir"], "/", data["save_name"], ".", data["file_type"]]))
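One thing to watch for in the function above: `ConfigParser.read()` silently skips files it cannot find, so a wrong config path only shows up later as a `KeyError` on `"Size"`. A minimal sketch of a loader that fails early is shown below. `load_rules` is a hypothetical helper, not part of the original code, and the temporary file exists only to make the demonstration self-contained.

```python
import configparser
import os
import tempfile


def load_rules(path):
    """Hypothetical helper: read a config file and fail early if it is missing."""
    parser = configparser.ConfigParser()
    # read() returns the list of files it actually parsed; missing files are skipped.
    loaded = parser.read(path, encoding="UTF-8")
    if not loaded:
        raise FileNotFoundError("config file not found: {0}".format(path))
    return parser


# Demonstration with a temporary config file.
with tempfile.TemporaryDirectory() as tmp_dir:
    config_path = os.path.join(tmp_dir, "config.ini")
    with open(config_path, "w", encoding="UTF-8") as f:
        f.write("[Size]\nfigure_x=8\n")

    rules = load_rules(config_path)
    figure_x = rules.getint("Size", "figure_x")

    try:
        load_rules(os.path.join(tmp_dir, "missing.ini"))
        missing_detected = False
    except FileNotFoundError:
        missing_detected = True

print(figure_x, missing_detected)
```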

The Python file that passes the data looks like this:

main.py



from make_line_graph import make_line_graph

data = {
    "title": "hogehoge",
    "label": ["A", "B"],
    "x_data": [x_data1, x_data2],
    "y_data": [y_data1, y_data2],
    "x_ticks": [x_ticks1, x_ticks2],
    "y_ticks": [y_ticks1, y_ticks2],
    "x_label": "hogehoge",
    "y_label": "hogehoge",
    "save_dir": "Path of the folder you want to save",
    "save_name": "File name you want to save",
    "file_type": "extension",
}

make_line_graph(data, config="config.ini")

Impressions after trying it

Good points

The design is easier to change. This helps most with font sizes, which need to vary depending on the data being plotted and the number of characters in the labels.

Also, by duplicating and customizing the configuration file, I have reduced how much of the Python file has to change when I want a different graph design. I only have to change the name of the configuration file that is read.
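As a variation on duplicating the whole file, configparser can also layer several sources, with values read later replacing earlier ones. The sketch below illustrates this with two embedded strings. `BASE` and `SLIDES` are names invented for this example, and the override values are made up.

```python
import configparser

BASE = """
[Size]
figure_x=8
figure_y=8
font_title=20
"""

# Hypothetical override for a slide deck: only the values that differ.
SLIDES = """
[Size]
font_title=28
"""

parser = configparser.ConfigParser()
parser.read_string(BASE)
parser.read_string(SLIDES)  # values read later replace earlier ones

title_size = parser.getint("Size", "font_title")
figure_x = parser.getint("Size", "figure_x")
print(title_size, figure_x)
```

With real files, passing a list of paths to `read()` layers them in the same order, so a variant file only needs to contain the settings it changes.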

Since the graph design and the configuration file are tied together, it no longer matters if I forget which design corresponds to which code.

Bad points

It is difficult to make it versatile.

The **make_line_graph.py** I wrote is a line graph creation function, and since I did not want similar Python files to multiply, I made it as versatile as I could. However, if it turns out that it cannot draw some graph well, and more line graph creation functions spring up to handle those cases, I will be back to square one.

I suspect there is no end to it once you start chasing versatility.
