1. Purpose

These notes record how to draw graphs in Jupyter with the Seaborn library.

2. Contents

2-1 Display time series data.

Draw a time series graph using Seaborn.

`sample.py`


import datetime
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd
import seaborn as sns

# Define the data set. (Write each date as a datetime.datetime object.)
dat = [
    [datetime.datetime(2020,1,1),4,10],
    [datetime.datetime(2020,1,2),7,7],
    [datetime.datetime(2020,1,3),10,4],
    [datetime.datetime(2020,1,4),13,2],
    [datetime.datetime(2020,1,5),17,1],
    [datetime.datetime(2020,1,6),12,4],
    [datetime.datetime(2020,1,7),9,3],
    [datetime.datetime(2020,1,8),7,8],
    [datetime.datetime(2020,1,9),5,9],
    [datetime.datetime(2020,1,10),3,12],
]

dat = pd.DataFrame(dat, columns=["DATE", "Y", "Z"])
dat.set_index("DATE", inplace=True)  # Use the date as the DataFrame index so that it appears on the horizontal axis.
print(dat)


fig = plt.figure()  # Create the figure object that the graph is drawn on.
ax = fig.add_subplot(111)  # Set the area that displays the graph (number of rows, number of columns, target graph number).
ax.plot(dat['Y'], label='Y', markersize=10, c="blue", marker="o")  # Plot column Y against the date index.
ax.plot(dat['Z'], label='Z', markersize=10, c="red", marker="o")  # Plot column Z against the date index.
ax.legend()  # Draw the legend.

# Graph format settings (how dates are displayed on the horizontal axis).
days    = mdates.DayLocator(bymonthday=None, interval=2, tz=None)  # Horizontal axis: place a tick every 2 days. (Without this line the dates may be duplicated.)
daysFmt = mdates.DateFormatter('%Y-%m-%d')  # Horizontal axis: format the dates as YYYY-MM-DD.
ax.xaxis.set_major_locator(days)  # Apply the tick positions to the horizontal axis.
ax.xaxis.set_major_formatter(daysFmt)  # Apply the date format to the horizontal axis.
fig.autofmt_xdate()  # Slant the dates on the horizontal axis so that they are easy to read.

# Label the graph.
ax.set_xlabel('Date')  # Set the X-axis title.
ax.set_ylabel('Y')  # Set the Y-axis title.
plt.title(r"TEST", fontname="MS Gothic")  # Set the title of the graph. A fontname that supports Japanese is needed only when the title contains Japanese text.
# Set the size of the graph.
fig.set_figheight(10)
fig.set_figwidth(20)
# Set the display range of the horizontal axis.
ax.set_xlim(datetime.datetime(2020,1,1), datetime.datetime(2020,1,12))

Execution result

`python`


             Y   Z
DATE              
2020-01-01   4  10
2020-01-02   7   7
2020-01-03  10   4
2020-01-04  13   2
2020-01-05  17   1
2020-01-06  12   4
2020-01-07   9   3
2020-01-08   7   8
2020-01-09   5   9
2020-01-10   3  12

2-2 Display the scatter plot data.

`sample.py`


import pandas as pd
import seaborn as sns

sns.set_style("whitegrid")
df1 = pd.DataFrame({'X': [1, 2, 3, 4, 5], 'Y': [4, 5.5, 6.2, 7.3, 7.8]})
ax = sns.regplot(x='X', y='Y', data=df1, fit_reg=False)  # fit_reg=False draws the points only, without a regression line.
ax.set_yscale("log")  # Use a logarithmic scale on the Y axis.
ax.set_xlim(0, 5)     # Set the range of the X axis.
ax.set_ylim(1, 10)    # Set the range of the Y axis.

Reference: https://pythondatascience.plavox.info/seaborn/%E6%95%A3%E5%B8%83%E5%9B%B3%E3%83%BB%E5%9B%9E%E5%B8%B0%E3%83%A2%E3%83%87%E3%83%AB

2-3 Create a scatter plot with a histogram.

`sample.py`


import pandas as pd
import numpy as np
import seaborn as sns

# Create test data. (Generate random numbers that follow a two-dimensional normal distribution.)
mean = [0, 0]
cov = [[1, 0], [0, 10]]
dset = np.random.multivariate_normal(mean, cov, 1000)  # Generate 1000 samples.
df = pd.DataFrame(dset, columns=['X', 'Y'])


# Set the background to white.
sns.set(style="white", color_codes=True)

# Draw the scatter plot with a histogram on each axis.
sns.jointplot(x="X", y="Y", data=df)
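To see what the `mean` and `cov` arguments in the sample control, the sketch below generates the same kind of data and estimates the covariance back from the samples. The seeded generator `rng` and the name `sample_cov` are assumptions added for illustration. The seed only makes the run repeatable.

```python
import numpy as np

rng = np.random.default_rng(0)  # Seeded generator (assumption, for repeatability).

mean = [0, 0]
cov = [[1, 0], [0, 10]]  # Variance 1 for X, variance 10 for Y, no correlation.
dset = rng.multivariate_normal(mean, cov, 1000)

# Estimate the covariance matrix from the samples (columns are variables).
sample_cov = np.cov(dset, rowvar=False)
print(sample_cov)
```

The diagonal of `sample_cov` should come out close to 1 and 10, which is why the joint plot is stretched much further along Y than along X.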

2-4 Arrange multiple graphs

`sample.py`


import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Create test data. (Generate random numbers that follow a two-dimensional normal distribution.)
mean = [0, 0]
cov = [[1, 0], [0, 10]]
dset = np.random.multivariate_normal(mean, cov, 1000)  # Generate 1000 samples.
df0 = pd.DataFrame(dset, columns=['X', 'Y'])
df1 = pd.DataFrame(dset, columns=['X', 'Y'])  # Holds the same data as df0 in this sample.

fig, (axis1, axis2) = plt.subplots(1, 2, sharey=True)  # Create a grid of 1 row and 2 columns that shares the Y axis.
sns.regplot(x='X', y='Y', data=df0, ax=axis1)  # Draw in the left panel.
sns.regplot(x='X', y='Y', data=df1, ax=axis2)  # Draw in the right panel.

2-5 Plot 1-point data.

To plot a data set that contains only one point, use DataFrame.plot, because regplot cannot be used on such a data set.

`sample.py`


import numpy as np
import pandas as pd
import seaborn as sns

#Definition of sample data
dat = [
    [1,8],
    [4,2],
    [7,6],
    [4,8],
    [20,15],
    [3,7]
]

dat1=[[4,4]]

df0 = pd.DataFrame(dat,columns=["X","Y"]) #1st dataset
df1 = pd.DataFrame(dat1,columns=["X","Y"]) #Second dataset


ax = sns.regplot(x='X', y='Y', data=df0, fit_reg=False)  # 1st dataset
ax.set_xlim(0, 5)    # Set the range of the X axis.
ax.set_ylim(1, 10)   # Set the range of the Y axis.


# Plot the second dataset.
# When the data set holds only one point, regplot cannot be used. Use DataFrame.plot instead.
df1.plot(kind="scatter", x="X", y="Y", s=500, c="yellow", marker="*", alpha=1, linewidths=2, edgecolors="red", ax=ax)
# s is the marker size; alpha is the opacity (0: transparent, 1: opaque).
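The same overlay can also be sketched with Matplotlib's `Axes.scatter` alone, which accepts a single point without any special handling. This is an alternative to the `DataFrame.plot` approach in the sample, not a replacement for it. The name `point` is an assumption introduced here.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Same sample data as above.
df0 = pd.DataFrame([[1, 8], [4, 2], [7, 6], [4, 8], [20, 15], [3, 7]], columns=["X", "Y"])
point = (4, 4)  # The single point to highlight.

fig, ax = plt.subplots()
ax.scatter(df0["X"], df0["Y"])  # 1st dataset
# Single highlighted point: a large yellow star with a red outline.
ax.scatter([point[0]], [point[1]], s=500, c="yellow", marker="*",
           linewidths=2, edgecolors="red")
ax.set_xlim(0, 5)   # Same axis ranges as the sample.
ax.set_ylim(1, 10)
```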


3 Statistical processing method

3-1 How to draw the kernel density function

Description of the kernel density function: https://www.ie-kau.net/entry/kernel_density_est

`/home/sampletest/sample.py`


from numpy.random import randn
import seaborn as sns
import numpy as np

dataset = randn(100)  # Generate 100 random numbers that follow the standard normal distribution.
ax = sns.kdeplot(dataset)

sns.rugplot(dataset, color='black')  # Mark each data point along the horizontal axis.
for bw in np.arange(0.5, 2.5, 0.5):  # Draw the kernel density function for bandwidth settings 0.5, 1.0, 1.5 and 2.0.
    sns.kdeplot(dataset, bw_method=bw, label=bw)  # Recent Seaborn versions take bw_method instead of bw.
ax.legend()  # Show which curve belongs to which bandwidth setting.
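To make the role of the bandwidth concrete, here is a minimal hand-written Gaussian kernel density estimate in NumPy. It is a sketch of the general technique, not Seaborn's implementation, and Seaborn may scale the bandwidth setting differently before using it. The function `gaussian_kde_1d` and the names `grid`, `areas` and `peaks` are hypothetical.

```python
import numpy as np


def gaussian_kde_1d(samples, grid, bandwidth):
    """Average of Gaussian bumps of width `bandwidth`, one centred on each sample."""
    samples = np.asarray(samples, dtype=float)
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    return kernels.mean(axis=1) / bandwidth


rng = np.random.default_rng(0)           # Seeded generator (assumption, for repeatability).
samples = rng.standard_normal(100)       # 100 standard normal samples.
grid = np.linspace(-10.0, 10.0, 2001)    # Points at which the density is evaluated.
step = grid[1] - grid[0]

densities = {h: gaussian_kde_1d(samples, grid, h) for h in (0.5, 1.0, 1.5, 2.0)}
areas = {h: float(d.sum() * step) for h, d in densities.items()}  # Each should be close to 1.
peaks = {h: float(d.max()) for h, d in densities.items()}         # Shrinks as the bandwidth grows.
```

A larger bandwidth spreads each bump more widely, so the curve becomes smoother and its peak lower while the total area stays close to 1.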

[PYTHON] How to use Seaborn
