Graphing time series data from a single source is nothing to worry about. However, I didn't know how to display time series data from different sources together, so I tried it. In short, I got the result I wanted, but the approach is not very elegant.
If there is a smarter way, please let me know.
For example, displaying each data set on its own is easy, but it gets troublesome when the acquisition periods and units (number of cases, %, etc.) differ.
Please see here for how to get data from DB or CSV in the first place.
Before getting into the details: I'm not used to Matplotlib in the first place, so I'll start with a quick review. The simplest code looks like this. It would be even simpler without subplot, but I deliberately use subplot (the notation for controlling multiple plots) so that it stays consistent with the code that follows.
#coding:utf-8
import matplotlib.pyplot as plt
#Create a blank canvas (figure)
fig = plt.figure()
#Prepare an area for the first plot in a 1-row, 1-column grid
ax = fig.add_subplot(1,1,1)
#Prepare data; x and y must have the same number of elements
x = [0,1,2,3,4,5]
y = [54,35,32,44,74,45]
#Set data (plot)
ax.plot(x,y)
#Show figure
plt.show()
Well, it looks normal.
See the code supplements and explanations at the bottom.
Prepare two data sets with different properties. The values and dates are generated automatically with NumPy and pandas functions.
Data 1: daily data from June 2, 2016 to June 7, 2016. There are 6 data points.
#coding:utf-8
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(1,1,1)
#Generate 6 random integers from 0 to 99
y1 = np.random.randint(0,100,6)
#Generate 6 daily datetimes starting at 2016-06-02 00:00:00
x1 = pd.date_range('2016-06-02 00:00:00',periods=6,freq='D')
ax.plot(x1,y1)
plt.show()
Like this.
Data 2: hourly data from June 3, 2016 to June 9, 2016. There are 150 data points.
#coding:utf-8
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(1,1,1)
#Generate 150 random integers from 5 to 39
y2 = np.random.randint(5,40,150)
#Generate 150 hourly datetimes starting at 2016-06-03 12:00:00
x2 = pd.date_range('2016-06-03 12:00:00',periods=150,freq='h')
ax.plot(x2,y2)
plt.show()
Because it is hourly data, the line fluctuates much more densely than data 1.
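As a quick check of the periods stated above, the following sketch prints the first and last timestamp of each range. It uses the same date_range calls; the frequency aliases 'D' and 'h' are my choice here, on the assumption that a recent pandas is in use (newer versions warn about 'H').

```python
import pandas as pd

# Data 1: 6 daily timestamps starting at 2016-06-02
x1 = pd.date_range('2016-06-02 00:00:00', periods=6, freq='D')
# Data 2: 150 hourly timestamps starting at 2016-06-03 12:00
x2 = pd.date_range('2016-06-03 12:00:00', periods=150, freq='h')

# First timestamp, last timestamp and number of points for each range
print(x1[0], x1[-1], len(x1))
print(x2[0], x2[-1], len(x2))
```

The two ranges start on different days and have very different densities, which is what makes a shared time axis awkward.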
Both data sets are time series, but their acquisition periods and properties differ. At a minimum, I want them displayed on a common time axis (x-axis).
For now, I will simply display them one above the other.
#coding:utf-8
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
fig = plt.figure()
#Reserve space for 2 rows x 1 column of plots
ax1 = fig.add_subplot(2,1,1)
ax2 = fig.add_subplot(2,1,2)
#data1
y1 = np.random.randint(0,100,6)
x1 = pd.date_range('2016-06-02 00:00:00',periods=6,freq='D')
#data2
y2 = np.random.randint(5,40,150)
x2 = pd.date_range('2016-06-03 12:00:00',periods=150,freq='h')
#plot
ax1.plot(x1,y1)
ax2.plot(x2,y2)
plt.show()
Both graphs are displayed, but the two time axes do not line up, so the result is virtually meaningless.
I tried various things. For me, a Python beginner, the following is the best I can do for now: generate dummy data (x0, y0 here) covering a period common to both data sets, and plot it on each graph.
Here, the dummy data covers 6/1 to 6/10.
#coding:utf-8
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
fig = plt.figure()
#Reserve space for 2 rows x 1 column of plots
ax1 = fig.add_subplot(2,1,1)
ax2 = fig.add_subplot(2,1,2)
#data0 Generate dummy data as a reference
#y values: a list of 10 zeros
#x values: daily datetimes from 6/1 to 6/10
y0 = [0]*10
x0 = pd.date_range('2016-06-01 00:00:00',periods=10,freq='D')
#data1
y1 = np.random.randint(0,100,6)
x1 = pd.date_range('2016-06-02 00:00:00',periods=6,freq='D')
#data2
y2 = np.random.randint(5,40,150)
x2 = pd.date_range('2016-06-03 12:00:00',periods=150,freq='h')
#plot
#ax1
ax1.plot(x0,y0)
ax1.plot(x1,y1)
#ax2
ax2.plot(x0,y0)
ax2.plot(x2,y2)
plt.show()
The time axes now line up, but the tick labels are hard to read, so next I adjust their formatting.
#coding:utf-8
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import datetime as dt
fig = plt.figure()
#Reserve space for 2 rows x 1 column of plots
ax1 = fig.add_subplot(2,1,1)
ax2 = fig.add_subplot(2,1,2)
#data0
y0 = [0]*10
x0 = pd.date_range('2016-06-01 00:00:00',periods=10,freq='D')
#data1
y1 = np.random.randint(0,100,6)
x1 = pd.date_range('2016-06-02 00:00:00',periods=6,freq='D')
#data2
y2 = np.random.randint(5,40,150)
x2 = pd.date_range('2016-06-03 12:00:00',periods=150,freq='h')
#plot
#ax1
ax1.plot(x0,y0)
ax1.plot(x1,y1,'r')
#ax2
ax2.plot(x0,y0)
ax2.plot(x2,y2,'b')
#Formatting
#ax1
ax1.set_xticks(x0)
ax1.set_xticklabels(x0,rotation=90,size="small")
ax1.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
ax1.grid()
#ax2
ax2.set_xticks(x0)
ax2.set_xticklabels(x0,rotation=90,size="small")
ax2.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
ax2.grid()
#Keep the vertical tick labels from overlapping or being cut off
plt.subplots_adjust(hspace=0.7,bottom=0.2)
plt.show()
The tick labels are now vertical and show only the date. I also changed the line colors. Personally, this is enough.
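As one possible alternative to the dummy data, here is a minimal sketch that aligns the time axes with shared axes and explicit limits. This is a swapped-in technique, not the method used above: plt.subplots(sharex=True), set_xlim, DayLocator and tick_params are existing Matplotlib calls, and the 6/1 to 6/10 range is carried over from the dummy data.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Same two data sets as above
x1 = pd.date_range('2016-06-02 00:00:00', periods=6, freq='D')
y1 = np.random.randint(0, 100, 6)
x2 = pd.date_range('2016-06-03 12:00:00', periods=150, freq='h')
y2 = np.random.randint(5, 40, 150)

# sharex=True makes both subplots use the same x-axis range
fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True)
ax1.plot(x1, y1, 'r')
ax2.plot(x2, y2, 'b')

# Set the common range (6/1 to 6/10) directly instead of plotting dummy data
ax2.set_xlim(pd.Timestamp('2016-06-01'), pd.Timestamp('2016-06-10'))

# One tick per day, date only, labels rotated to vertical
ax2.xaxis.set_major_locator(mdates.DayLocator())
ax2.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
ax2.tick_params(axis='x', labelrotation=90, labelsize='small')

ax1.grid()
ax2.grid()
fig.subplots_adjust(bottom=0.2)
```

Call plt.show() afterwards to display the figure. With shared axes the date labels appear only on the lower graph, so less vertical spacing is needed.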
In many cases there is no need to split the data into two graphs, so next I display both in a single graph.
#coding:utf-8
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import datetime as dt
fig = plt.figure()
#1 row x 1 column plot
ax1 = fig.add_subplot(1,1,1)
#Add a second y-axis that shares the same x-axis (like a layer)
ax2 = ax1.twinx()
#data0
y0 = [0]*10
x0 = pd.date_range('2016-06-01 00:00:00',periods=10,freq='D')
#data1
y1 = np.random.randint(0,100,6)
x1 = pd.date_range('2016-06-02 00:00:00',periods=6,freq='D')
#data2
y2 = np.random.randint(5,40,150)
x2 = pd.date_range('2016-06-03 12:00:00',periods=150,freq='h')
#plot
#ax1
ax1.plot(x0,y0)
ax1.plot(x1,y1,'r')
#ax2
ax2.plot(x0,y0)
ax2.plot(x2,y2,'b')
#Formatting
#ax1
ax1.set_xticks(x0)
ax1.set_xticklabels(x0,rotation=90,size="small")
ax1.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
ax1.grid()
#Axis format adjustment
ax1.set_ylabel('pv', color='r')
for tl in ax1.get_yticklabels():
    tl.set_color('r')
#ax2
ax2.set_xticks(x0)
ax2.set_xticklabels(x0,rotation=90,size="small")
ax2.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
ax2.grid()
#Axis format adjustment
ax2.set_ylabel('cpu', color='b')
for tl in ax2.get_yticklabels():
    tl.set_color('b')
#Keep the vertical tick labels from being cut off
plt.subplots_adjust(hspace=0.7,bottom=0.2)
plt.show()
Well, it looks like this. I colored each axis to match its line so that it is easier to tell which data it belongs to (although it is still somewhat hard to read).
In the examples above, the x data was generated with pandas date_range(), so it is datetime type from the start. When the data comes from CSV or a DB, however, the dates are strings. Matplotlib accepts strings as they are, but it treats them as categories rather than points in time, so convert them to datetime when you need a real time axis.
A list of strings can be converted to a list of datetimes as follows.
import datetime as dt
y1 = [0.43,0.26,0.33]
d1 = ['2016-06-21 12:00:00','2016-06-23 09:00:00','2016-06-26 18:00:00']
x1 = [dt.datetime.strptime(d,'%Y-%m-%d %H:%M:%S') for d in d1]
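Since pandas is already imported in the examples above, the same conversion can probably also be done in one call with pd.to_datetime. The sketch below compares the two approaches; the variable name x1_pd is made up for illustration.

```python
import datetime as dt
import pandas as pd

d1 = ['2016-06-21 12:00:00', '2016-06-23 09:00:00', '2016-06-26 18:00:00']

# Element-by-element conversion with the standard library
x1 = [dt.datetime.strptime(d, '%Y-%m-%d %H:%M:%S') for d in d1]

# Whole-list conversion with pandas; the result is a DatetimeIndex
x1_pd = pd.to_datetime(d1, format='%Y-%m-%d %H:%M:%S')

print(x1[0])
print(x1_pd[0])
```

Either result can be passed to ax.plot() as the x data.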
Next, how to edit the x-axis.
Dates make this a little hard to follow, so take a numeric example: if the horizontal axis runs from 0 to 1000 and you want ticks only at 300, 600 and 900, use set_xticks([300,600,900]).
Here I want a tick for each day from 6/1 to 6/10, so I pass the dummy x0 as it is.
ax1.set_xticks(x0)
After deciding where the ticks go, the next step is to decide what they display and how. In the example above the positions are set with set_xticks([300,600,900]); to display them as small, medium and large, use set_xticklabels(['small','medium','large']). rotation sets the angle of the text; 90 makes it vertical. size sets the font size.
Here I want to display the dates from 6/1 to 6/10 as they are, so I pass x0 again.
ax1.set_xticklabels(x0,rotation=90,size="small")
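The numeric example from the explanation can be tried on its own. This is a minimal sketch; the plotted values are made-up numbers used only to give the axis a 0 to 1000 range.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
# Hypothetical data spanning 0 to 1000 on the x-axis
ax.plot([0, 250, 500, 750, 1000], [3, 8, 5, 9, 4])

# Put ticks only at 300, 600 and 900
ax.set_xticks([300, 600, 900])
# Replace the numbers with text, rotated to vertical
ax.set_xticklabels(['small', 'medium', 'large'], rotation=90, size='small')

# Read back the label text that will be drawn
labels = [t.get_text() for t in ax.get_xticklabels()]
print(labels)
```

The number of labels has to match the number of tick positions, which is why set_xticks is called first.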
Displayed as is, x0 shows the year, month, day, hour, minute and second, which is long, so only the year, month and day are shown.
ax1.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
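To see what the formatter does to a single value, it can be called directly. This sketch assumes Matplotlib's default timezone setting (UTC); the sample datetime is arbitrary.

```python
import datetime as dt
import matplotlib.dates as mdates

formatter = mdates.DateFormatter('%Y-%m-%d')

# Matplotlib stores dates as floating-point day numbers; date2num converts to that form
num = mdates.date2num(dt.datetime(2016, 6, 1, 12, 30, 0))

# Calling the formatter returns the label text for that position
label = formatter(num)
print(label)
```

The time-of-day part is dropped because the format string contains only the year, month and day.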
To color the line of a data set, do as follows. In this example it will be red.
ax1.plot(x1,y1,'r')
To label an axis and color both the label and the tick labels, do as follows. In this example they will be blue.
ax2.set_ylabel('cpu', color='b')
for tl in ax2.get_yticklabels():
    tl.set_color('b')
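The loop over the tick labels can probably be replaced with a single tick_params call. The following is a minimal sketch with made-up data, not the code used above.

```python
import matplotlib.pyplot as plt

fig, ax1 = plt.subplots()
ax2 = ax1.twinx()  # second y-axis on the right, sharing the x-axis

# Hypothetical values for the two series
ax1.plot([0, 1, 2], [10, 50, 30], 'r')
ax2.plot([0, 1, 2], [0.2, 0.1, 0.4], 'b')

ax1.set_ylabel('pv', color='r')
ax2.set_ylabel('cpu', color='b')

# colors= sets both the tick marks and the tick labels in one call
ax1.tick_params(axis='y', colors='r')
ax2.tick_params(axis='y', colors='b')
```

Note that tick_params also colors the tick marks themselves, whereas the loop above changes only the label text.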
hspace adjusts the vertical spacing between graphs; the value is a fraction of the average height of a graph (taken as 1.0). bottom is the bottom margin, as a fraction of the figure height.
plt.subplots_adjust(hspace=0.7,bottom=0.2)
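To confirm what the two parameters do, the values can be read back from the figure. A minimal sketch, using fig.subplots_adjust, the figure-level equivalent of the plt call:

```python
import matplotlib.pyplot as plt

fig = plt.figure()
ax1 = fig.add_subplot(2, 1, 1)
ax2 = fig.add_subplot(2, 1, 2)

# hspace: vertical gap between subplots, as a fraction of the average axes height
# bottom: position of the bottom edge of the subplots, as a fraction of the figure height
fig.subplots_adjust(hspace=0.7, bottom=0.2)

# The stored layout parameters, and the resulting bottom edge of the lower plot
print(fig.subplotpars.hspace, fig.subplotpars.bottom)
print(ax2.get_position().y0)
```

The lower graph now starts 20% of the way up the figure, which leaves room for the vertical date labels.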