[Python] matplotlib: Writing text on a time series graph

Introduction

I made a time series graph and got stuck on how to display text at a specified position in it. I looked into it, so I am posting what I found. As for the environment, the work is done in a Jupyter notebook.

Example

fig_test.png

What I worked out

In the graph shown in the example, the periods with missing flow rate data are filled in gray, and each period is labeled with text. Displaying that text for the missing periods is the main point of this post.

For the axis settings of the time series graph, I follow an article I posted earlier.

https://qiita.com/damyarou/items/19f19658b618fd05b3b6

How to display text on a time series graph

Create lists of strings giving the start and end of each missing period and convert them to datetime type

    import datetime

    # start of no discharge data (start of each missing period)
    _sss=['2014-10-29',
          '2014-12-01',
          '2017-08-01',
          '2018-01-01',
          '2018-06-01',
          '2019-06-01']
    # end of no discharge data (end of each missing period)
    _sse=['2014-10-31',
          '2014-12-31',
          '2017-12-31',
          '2018-04-30',
          '2018-08-31',
          '2019-12-31']
    # convert each string pair to datetime objects
    sss=[]
    sse=[]
    for ss,se in zip(_sss,_sse):
        sss.append(datetime.datetime.strptime(ss, '%Y-%m-%d'))
        sse.append(datetime.datetime.strptime(se, '%Y-%m-%d'))

While looping over the missing periods, fill the relevant part in gray and display the text

        for ss,se in zip(sss,sse):
            # x coordinate (as a float) of the center of the missing period, used as the text position
            xx=dates.date2num(ss)+(dates.date2num(se)-dates.date2num(ss))/2
            x1=dates.date2num(xmin) # start of the graph's x-axis
            x2=dates.date2num(xmax) # end of the graph's x-axis
            if x1<xx<x2: # draw only if the text position lies inside the graph's x range
                plt.axvspan(ss,se,color='#cccccc') # fill the missing period in gray
                xs=dates.num2date(xx) # convert the float back to datetime
                ys=ymin+0.3*(ymax-ymin) # 30% of the way up the y-axis
                tstr1 = ss.strftime('%Y/%m/%d')
                tstr2 = se.strftime('%Y/%m/%d')
                sstr=tstr1+'~'+tstr2+'\nno discharge data'
                plt.text(xs,ys,sstr,ha='center',va='bottom',fontsize=fsz,rotation=90,linespacing=1.5)

Filling in gray is done with axvspan.

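To specify the text display position, I `import matplotlib.dates as dates` and use `dates.date2num()`, which converts a `datetime` to a floating-point number of days since Matplotlib's epoch, and `dates.num2date()`, which converts that floating-point number back to a `datetime`. (In older Matplotlib versions the number was the days since midnight on January 1 of year 1, plus one; since Matplotlib 3.3 the default epoch is 1970-01-01. Only differences between values matter here, so the code works either way.)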
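As a small illustration of the two conversions, the sketch below computes the midpoint of one missing period. The variable names `start`, `end`, `n_mid`, and `mid` are mine and do not appear in the original program. Note that `num2date()` returns a timezone-aware `datetime` (UTC by default).

```python
import datetime
import matplotlib.dates as dates

# One missing period taken from the lists above
start = datetime.datetime(2014, 12, 1)
end = datetime.datetime(2014, 12, 31)

# date2num() gives days since Matplotlib's epoch as a float,
# so the midpoint is ordinary arithmetic on two numbers.
n_start = dates.date2num(start)
n_end = dates.date2num(end)
n_mid = n_start + (n_end - n_start) / 2

# num2date() converts the float back to a (timezone-aware) datetime
mid = dates.num2date(n_mid)
print(n_end - n_start)           # length of the period in days
print(mid.strftime('%Y/%m/%d'))  # date at the center of the period
```

Because only the difference between two converted values is used, the result does not depend on which epoch the installed Matplotlib version uses.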

In the program, a similar figure is created for each year over multiple years. For each graph I therefore check the start and end of the x-axis and draw the fill and the text only when the text display position lies between them. Without this check, the text ends up displayed far outside the graph.
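The check can also be pulled out into a small function, as in the following sketch. `midpoint_is_visible` is a hypothetical helper name, not part of the original program, and the two periods are picked from the lists above only for illustration.

```python
import datetime
import matplotlib.dates as dates


def midpoint_is_visible(ss, se, xmin, xmax):
    """Return True if the midpoint of [ss, se] lies strictly inside (xmin, xmax)."""
    xx = (dates.date2num(ss) + dates.date2num(se)) / 2
    # bool() turns the NumPy comparison result into a plain Python bool
    return bool(dates.date2num(xmin) < xx < dates.date2num(xmax))


# x-axis range of the 2014 graph
xmin = datetime.datetime(2014, 1, 1)
xmax = datetime.datetime(2014, 12, 31)

periods = [
    (datetime.datetime(2014, 12, 1), datetime.datetime(2014, 12, 31)),
    (datetime.datetime(2017, 8, 1), datetime.datetime(2017, 12, 31)),
]
visible = [midpoint_is_visible(ss, se, xmin, xmax) for ss, se in periods]
print(visible)  # the 2014 period is drawn, the 2017 period is skipped
```

Testing the midpoint rather than the two end points means a period that straddles a year boundary is labeled only in the year that contains its center.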

The text is output over two lines with a single `plt.text()` call. The line spacing is adjusted with `linespacing=1.5`.
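A stripped-down sketch of the fill and the two-line label for a single period might look like this. It uses the object-oriented `ax` interface and the off-screen `Agg` backend so that it runs on its own. The y position of 120 and the figure size are arbitrary choices for illustration, and the midpoint is computed with plain `datetime` arithmetic instead of `date2num()`.

```python
import datetime
import matplotlib
matplotlib.use('Agg')  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(8, 4))
ax.set_xlim(datetime.datetime(2014, 1, 1), datetime.datetime(2014, 12, 31))
ax.set_ylim(0, 400)

ss = datetime.datetime(2014, 12, 1)
se = datetime.datetime(2014, 12, 31)
ax.axvspan(ss, se, color='#cccccc')  # gray band over the missing period

# '\n' splits the label into two lines inside one text object
label = ss.strftime('%Y/%m/%d') + '~' + se.strftime('%Y/%m/%d') + '\nno discharge data'
mid = ss + (se - ss) / 2  # center of the period
txt = ax.text(mid, 120, label, ha='center', va='bottom',
              rotation=90, linespacing=1.5)

fig.canvas.draw()  # render once; call fig.savefig(...) to write an image file
```

With `rotation=90` the two lines sit side by side along the x direction, so `linespacing` controls the horizontal gap between the date range and the "no discharge data" line.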

See below for functions.

Something I'm aware of but haven't fixed

If you look closely at the graph, you can see a one-day blank before and after the missing period at the end of October, and before the missing period in December. This is because each data point is timestamped at midnight (00:00) of its day, so the interval from 00:00 to 24:00 of that day remains unfilled. Since the plotted data are daily values, they should strictly be drawn with `bar`, but that takes a long time to draw, so I settle for `fill_between`.
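One possible workaround, which the original program does not use, is to widen the gray band by one day on each side so that it reaches the last sample before the gap and the first sample after it. In the sketch below, `padded_span` and `pad_days` are hypothetical names, and whether a one-day pad is appropriate is an assumption that depends on how the data are timestamped.

```python
import datetime


def padded_span(ss, se, pad_days=1):
    """Widen a missing period by pad_days on each side (hypothetical helper)."""
    pad = datetime.timedelta(days=pad_days)
    return ss - pad, se + pad


ss = datetime.datetime(2014, 10, 29)
se = datetime.datetime(2014, 10, 31)
x0, x1 = padded_span(ss, se)
print(x0.date(), x1.date())
# In the drawing loop one could then write: plt.axvspan(x0, x1, color='#cccccc')
```

The label text would still be built from the unpadded `ss` and `se`, so the dates shown remain the actual missing period.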

Full program

# Time series drawing
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import datetime
import matplotlib.dates as dates
import matplotlib.dates as mdates


def inp_rf():
    fnameR='xls_rdata_20190630.xlsx'
    df = pd.read_excel(fnameR,index_col=0)
    df.index = pd.to_datetime(df.index, format='%Y/%m/%d')
    df_rf=df['2013/01/01':'2019/12/31']
    return df_rf


def inp_qq():
    fnameR='df_qq1.csv'
    df = pd.read_csv(fnameR,index_col=0)
    df.index = pd.to_datetime(df.index, format='%Y/%m/%d')
    df_qq=df['2013/01/01':'2019/12/31']
    return df_qq

    
def drawfig(df_rf,df_qq):
    # start of no discharge data
    _sss=['2014-10-29',
          '2014-12-01',
          '2017-08-01',
          '2018-01-01',
          '2018-06-01',
          '2019-06-01']
    # end of no discharge data
    _sse=['2014-10-31',
          '2014-12-31',
          '2017-12-31',
          '2018-04-30',
          '2018-08-31',
          '2019-12-31']
    sss=[]
    sse=[]
    for ss,se in zip(_sss,_sse):
        ss = datetime.datetime.strptime(ss, '%Y-%m-%d')
        se = datetime.datetime.strptime(se, '%Y-%m-%d')
        sss.append(ss)
        sse.append(se)
    yyyy=np.array([2013,2014,2015,2016,2017,2018,2019])
    fsz=12
    st='01-01'
    ed='12-31'
    for year in yyyy:
        sxmin=str(year)+'-'+st
        sxmax=str(year)+'-'+ed
        plt.figure(figsize=(16,6),facecolor='w')
        plt.rcParams['font.size']=fsz
        xmin = datetime.datetime.strptime(sxmin, '%Y-%m-%d')
        xmax = datetime.datetime.strptime(sxmax, '%Y-%m-%d')
        ymin=0
        ymax=400
        plt.xlim([xmin,xmax])
        plt.ylim([ymin,ymax])
        sxlabel='Date ({0})'.format(year)
        plt.xlabel(sxlabel)
        plt.ylabel('Daily discharge Q (m$^3$/s)')
        plt.grid(which='major',axis='both',color='#999999',linestyle='--')
        for ss,se in zip(sss,sse):
            xx=dates.date2num(ss)+(dates.date2num(se)-dates.date2num(ss))/2
            x1=dates.date2num(xmin)
            x2=dates.date2num(xmax)
            if x1<xx<x2:
                plt.axvspan(ss,se,color='#cccccc')
                xs=dates.num2date(xx)
                ys=ymin+0.3*(ymax-ymin)
                tstr1 = ss.strftime('%Y/%m/%d')
                tstr2 = se.strftime('%Y/%m/%d')
                sstr=tstr1+'~'+tstr2+'\nno discharge data'        
                plt.text(xs,ys,sstr,ha='center',va='bottom',fontsize=fsz,rotation=90,linespacing=1.5)

        plt.fill_between(df_qq.index,df_qq['q_tot'],0,color='#ff00ff',label='Q (total)')
        plt.fill_between(df_qq.index,df_qq['q_lll']+df_qq['q_rrr'],0,color='#00ff00',label='Q (Right)')
        plt.fill_between(df_qq.index,df_qq['q_lll'],0,color='#ff0000',label='Q (Left)')
        plt.twinx()
        plt.ylim([ymax,ymin])
        plt.ylabel('Daily rainfall RF (mm/day)')
        plt.fill_between(df_rf.index,df_rf['RF'],0,color='#0000ff',label='RF in basin by JWA')
        plt.fill_between([0],[0],0,color='#ff00ff',label='Q (total)')
        plt.fill_between([0],[0],0,color='#00ff00',label='Q (Right)')
        plt.fill_between([0],[0],0,color='#ff0000',label='Q (Left)')
        plt.legend(bbox_to_anchor=(1, 1.01), loc='lower right', borderaxespad=0.1, ncol=4, shadow=True, fontsize=fsz-2)
    
        #plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%d-%b-%Y'))
        plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%d-%b'))
        plt.gca().xaxis.set_major_locator(mdates.MonthLocator(interval=1))
        plt.gca().xaxis.set_minor_locator(mdates.MonthLocator(interval=1))
        plt.gcf().autofmt_xdate()

        fnameF='fig_'+str(year)+'.png'
        plt.savefig(fnameF, dpi=100, bbox_inches="tight", pad_inches=0.1)
        plt.show()


def main():
    df_rf=inp_rf()
    df_qq=inp_qq()
    drawfig(df_rf,df_qq)
    

#==============
# Execution
#==============
if __name__ == '__main__': main()

That's all.
