[PYTHON] Fitting to ARMA, ARIMA model

Introduction

In this article, we describe how to fit given time series data to AR, MA, and ARMA models using Python.

Function to use

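The function statsmodels.tsa.arima_model.ARMA.fit is used. See the documentation for details (https://www.statsmodels.org/devel/generated/statsmodels.tsa.arima_model.ARMA.fit.html#statsmodels.tsa.arima_model.ARMA.fit). Note that this class was deprecated in statsmodels 0.12 and removed in 0.13; newer versions provide statsmodels.tsa.arima.model.ARIMA instead.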

AR model fitting

As an example, we will fit the following AR(1) model.

$$ y_t = 1 + 0.5 y_{t-1} + \epsilon_t $$

Here, $\epsilon_t$ is Gaussian white noise with variance 1, and the initial value is $y_0 = 2$.
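Because the AR coefficient 0.5 is smaller than 1 in absolute value, this process is stationary, and its mean and variance follow directly from the equation. The short derivation below is a worked sketch; the symbols $\mu$ and $\gamma_0$ are notation introduced here and do not appear in the original text.

```latex
\begin{align*}
\mu &= E[y_t] = 1 + 0.5\,\mu \;\Rightarrow\; \mu = \frac{1}{1 - 0.5} = 2, \\
\gamma_0 &= \mathrm{Var}(y_t) = 0.5^2\,\gamma_0 + 1 \;\Rightarrow\; \gamma_0 = \frac{1}{1 - 0.25} = \frac{4}{3}.
\end{align*}
```

The initial value $y_0 = 2$ therefore starts the series exactly at its stationary mean.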

# Boilerplate: import the modules and make the graphs look good
import matplotlib as mpl
import matplotlib.pyplot as plt
plt.style.use('seaborn')  # on matplotlib 3.6 and later this style is named 'seaborn-v0_8'
mpl.rcParams['font.family'] = 'serif'
%matplotlib inline
import numpy as np
import statsmodels.api as sm
import statsmodels.tsa.api as smt
p = print  # shorthand for print

# Generate the time series data to plot
# Here we simulate 1000 time steps
y = np.zeros(1000)
np.random.seed(42)
epsilon = np.random.standard_normal(1000)
y[0] = 2
for t in range(1,1000):
    y[t] = 1 + 0.5 * y[t-1] + epsilon[t]

# Plot the time series to take a look at it
plt.plot(y)
plt.xlabel('time')
plt.ylabel('value')
plt.title('time-value plot');

This produces the following graph (img.png).

Next, fit an AR(1) model to this data.

mdl = smt.ARMA(y, order=(1, 0)).fit()
p(mdl.summary())

The result is as follows (screenshot: キャプチャ.PNG). The summary reports const = 2.0336 and an AR coefficient of 0.4930. The AR coefficient is close to the true value of 0.5. Note that the const reported by statsmodels is the mean of the process, not the intercept 1 that appears in the equation: for this model the mean is $1 / (1 - 0.5) = 2$, so the estimate 2.0336 is also close to the true value. If the constant term is known to be 0, pass trend='nc' to fit:

mdl = smt.ARMA(y, order=(1, 0)).fit(trend='nc')
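The relationship between the intercept in the equation and the mean of the process can be checked without statsmodels. The following is a minimal sketch that uses ordinary least squares in place of the maximum likelihood fit; the variable names `intercept`, `phi`, and `implied_mean` are introduced here for illustration only.

```python
import numpy as np

# Re-create the simulated AR(1) series from the article (same seed and recursion).
np.random.seed(42)
n = 1000
epsilon = np.random.standard_normal(n)
y = np.zeros(n)
y[0] = 2
for t in range(1, n):
    y[t] = 1 + 0.5 * y[t - 1] + epsilon[t]

# Regress y[t] on [1, y[t-1]] by ordinary least squares
# (a conditional least-squares estimate, not the statsmodels MLE).
X = np.column_stack([np.ones(n - 1), y[:-1]])
intercept, phi = np.linalg.lstsq(X, y[1:], rcond=None)[0]

# Convert the intercept c into the implied process mean c / (1 - phi).
implied_mean = intercept / (1 - phi)

print(f"intercept c  = {intercept:.4f}")
print(f"AR coeff phi = {phi:.4f}")
print(f"implied mean = {implied_mean:.4f}")
```

The least-squares intercept should land near 1, while the implied mean should land near 2, which is the quantity the summary labels const.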

MA model fitting

To fit an MA model, only the order argument needs to change. The tuple is (AR order, MA order), so fitting MA(1) looks like this:

mdl = smt.ARMA(y, order=(0, 1)).fit()
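The series y used so far was generated by an AR(1) process, so the MA(1) fit shown here only demonstrates the syntax. To try it on data that really is MA(1), a test series can be simulated as in the sketch below. The coefficient `theta = 0.5`, the constant 1, and the helper `sample_acf` are hypothetical choices made for this example.

```python
import numpy as np

np.random.seed(42)
n = 1000
theta = 0.5  # hypothetical MA(1) coefficient chosen for this example

# MA(1): y_t = 1 + eps_t + theta * eps_{t-1}
eps = np.random.standard_normal(n + 1)
y_ma = 1 + eps[1:] + theta * eps[:-1]

def sample_acf(x, lag):
    """Sample autocorrelation of x at a positive lag."""
    x = x - x.mean()
    return np.dot(x[lag:], x[:-lag]) / np.dot(x, x)

acf1 = sample_acf(y_ma, 1)
acf2 = sample_acf(y_ma, 2)

# For MA(1), the lag-1 autocorrelation is theta / (1 + theta^2) and higher lags are 0.
print("lag-1 autocorrelation:", acf1, "theory:", theta / (1 + theta ** 2))
print("lag-2 autocorrelation:", acf2, "theory: 0")
```

An autocorrelation that is clearly nonzero at lag 1 and close to zero afterwards is the typical signature of MA(1) data.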

ARMA model fitting

To fit ARMA(1, 1):

mdl = smt.ARMA(y, order=(1, 1)).fit()

When the ARMA order is not known in advance

The function sm.tsa.arma_order_select_ic can be used to estimate the optimal order based on an information criterion. See the documentation for details on the function (https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.arma_order_select_ic.html)

ARMA model order estimation

The order is estimated using the time series data of the AR(1) model described above. In other words, the estimation is successful if it returns the order (1, 0).

# Re-create the AR(1) series used above
y = np.zeros(1000)
np.random.seed(42)
epsilon = np.random.standard_normal(1000)
y[0] = 2
for t in range(1,1000):
    y[t] = 1 + 0.5 * y[t-1] + epsilon[t]

sm.tsa.arma_order_select_ic(y)

The result is as follows (screenshot: キャプチャ.PNG).

The order (1, 0) was selected as optimal. The matrix shows the BIC values: the rows correspond to the AR order and the columns to the MA order.
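To make the idea of order selection by information criterion concrete, the sketch below computes a BIC by hand for AR(p) models with p from 0 to 4. It is a simplified illustration under stated assumptions: it covers AR orders only, fits by least squares, and is not the procedure arma_order_select_ic uses internally. The names `ar_bic`, `max_p`, and `best_p` are introduced here for illustration.

```python
import numpy as np

# Re-create the simulated AR(1) series from the article.
np.random.seed(42)
n = 1000
epsilon = np.random.standard_normal(n)
y = np.zeros(n)
y[0] = 2
for t in range(1, n):
    y[t] = 1 + 0.5 * y[t - 1] + epsilon[t]

max_p = 4
target = y[max_p:]   # common sample, so every order is scored on the same observations
m = len(target)

def ar_bic(p):
    """BIC of an AR(p) model with intercept, fitted by least squares."""
    cols = [np.ones(m)] + [y[max_p - k : n - k] for k in range(1, p + 1)]
    X = np.column_stack(cols)
    beta = np.linalg.lstsq(X, target, rcond=None)[0]
    resid = target - X @ beta
    sigma2 = resid @ resid / m
    # Gaussian log-likelihood evaluated at the estimated variance.
    loglik = -0.5 * m * (np.log(2 * np.pi * sigma2) + 1)
    k_params = p + 2  # intercept, p AR coefficients, noise variance
    return -2 * loglik + k_params * np.log(m)

bics = [ar_bic(p) for p in range(max_p + 1)]
best_p = int(np.argmin(bics))

for p_order, bic in enumerate(bics):
    print(f"AR({p_order}): BIC = {bic:.2f}")
print("order with the smallest BIC:", best_p)
```

The order with the smallest BIC is the one selected. Each extra coefficient adds a penalty of log(m), so a larger model wins only if it improves the fit by more than that.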

If you want to use AIC, or if you want to evaluate with both AIC and BIC, write as follows.

# Use AIC
sm.tsa.arma_order_select_ic(y, ic='aic')
# Evaluate with both information criteria at the same time
sm.tsa.arma_order_select_ic(y, ic=['aic', 'bic'])

It is also possible to change the maximum orders that are searched (the max_ar and max_ma arguments) or to run the selection under the assumption that the constant term is 0 (the trend argument).
