Custom state space model in Python

Introduction

Among the state space models, there were few sites that explained the custom model in Japanese (I could not find it), so I will post the results of my own research.

statsmodels (especially custom state-space models) have a hard time defining models, so I had a hard time around that.

I would appreciate it if you could point out any incorrect descriptions.

What is a state space model?

Observation equations and equations of state

The structure of the model consists of an observation equation (there is also a document described as an observation model) and an equation of state (there is also a document described as a system model).

Observation equation: $ y_t = Z_t (x_ {t}) + d_t + \ epsilon_t $ Equation of state: $ x_ {t + 1} = T_t (x_ {t}) + c_t + R_t \ eta_ {t} $

Past, present and future

Although detailed explanation is omitted, the state space model makes the following estimates for the past, present, and future.

Significance of custom model

Among the state space models, the significance of the custom model is as follows.

Model used this time

Observation equation:  y_t = x_{t} + \epsilon_t, \hspace{10pt}\epsilon_t\sim N(0,H_t) Equation of state:  x_{t} = T_t x_{t-1} + A\cdot\exp\left(\frac{B}{z}\right) + \eta_{t}, \hspace{10pt}\eta_{t}\sim N(0,Q_t)

The difference from the above equation is that $ Z_t $ is fixed in the observation equation, $ d_t $ in the constant term is omitted, and $ c_t $ in the constant term in the equation of state is $ A \ cdot . Replaced with exp \ left (\ frac {B} {z} \ right) $.

Library import etc.

Now here is the Python code.

import numpy as np
import pandas as pd
import datetime
import statsmodels.api as sm
import matplotlib.pyplot as plt
%matplotlib inline
plt.rc("figure", figsize=(16,8))
plt.rc("font", size=15)

Data creation for testing

Create data for testing. (When actually using it, such work is unnecessary, so detailed explanation is omitted.)

def gen_data_for_model1():
    nobs = 1000

    rs = np.random.RandomState(seed=93572)

    Ht = 5
    Tt = 1.001
    A = 0.1
    B = -0.007
    Qt = 0.01
    var_z = 0.1

    et = rs.normal(scale=Ht**0.5, size=nobs)
    z = np.cumsum(rs.normal(size=nobs, scale=var_z**0.05))
    Et = rs.normal(scale=Qt**0.5, size=nobs)

    xt_1 = 50
    x = []
    
    for i in range(nobs):
        xt = Tt * xt_1
        x.append(xt)
        xt_1 = xt
    xt = np.array(x)
    xt = xt + A * np.exp(B/z) + Et
    yt = xt + et
    
    return yt, xt, z

yt, xt, z = gen_data_for_model1()
_ = plt.plot(yt,color = "r")
_ = plt.plot(xt, color="b")

data.png

Store data in DataFrame. Set index to datetime to make predictions

df = pd.DataFrame()
df['y'] = yt
df['x'] = xt
df['z'] = z
st = datetime.datetime.strptime("2001/1/1 0:00", '%Y/%m/%d %H:%M')
date = []
for i in range(1000):
    if i == 0:
        d = st
    dt = d.strftime('%Y/%m/%d %H:%M')
    date.append(dt)
    d += datetime.timedelta(days=1)
df['date'] = date
df['date'] = pd.to_datetime(df['date'] )
df = df.set_index("date")
df

df.png

Model definition

class custom(sm.tsa.statespace.MLEModel):
    param_names = ['T', 'A', 'B', 'Ht', 'Qt'] #Specify parameters to estimate
    start_params = [1., 1., 0., 1., 1] #Initial value of the parameter to be estimated

    def __init__(self, endog, exog):
        exog = np.squeeze(exog)
        super().__init__(endog, exog=exog, k_states=1, initialization='diffuse')
        self.k_exog = 1
        
        # Z = I,The part corresponding to Zt in the above equation becomes the identity matrix this time.
        self['design', 0, 0] = 1.
        # R = I,The part corresponding to Rt in the above equation becomes the identity matrix this time.
        self['selection', 0, 0] = 1.
        
        # c_Set t to time
        self['state_intercept'] = np.zeros((1, self.nobs))
        
    def clone(self, endog, exog, **kwargs):
        return self._clone_from_init_kwds(endog, exog=exog, **kwargs)
        
    def transform_params(self, params):
        #The variance needs to be a positive number, so square it once
        params[3:] = params[3:]**2
        return params
        
    def untransform_params(self, params):
        #Squared 1/Square to return to original size (square root)
        params[3:] = params[3:]**0.5
        return params
    
    def update(self, params, **kwargs):
        #It is a specification to update
        params = super().update(params, **kwargs)
        # T = T
        self['transition', 0, 0] = params[0] 
        # c_t = A * exp(B / z_t)
        self['state_intercept', 0, :] = params[1] * np.exp(params[2] / self.exog)       
        # Ht
        self['obs_cov', 0, 0] = params[3]
        # Qt
        self['state_cov', 0, 0] = params[4]

Divide into estimation data and prediction data

df_test = df.iloc[:800,:]
df_train = df.iloc[800:,:]

Learning, summary confirmation

mod = custom(df_test['y'], df_test['z'])
res = mod.fit()
print(res.summary())

サマリー.png

The coef part is the estimated value, but it is close to the numerical value of the data created at the beginning.

Store the estimation result in DataFrame

ss = pd.DataFrame(res.smoothed_state.T, columns=['x'], index=df_test.index)

Estimate and forecast

predict = res.get_prediction()
forecast = res.get_forecast(df.index[-1], exog = df_train['z'].values)

When you make a prediction, you need to specify to what point you want to make a prediction. (This time specified by df.index [-1]) Also, when using exogenous variables in the model, it is necessary to specify the data with the argument of "exog =" at the time of prediction. This time, all of them are dummy data, but if you want to actually use them in prediction, you need to create dummy data from past data. Data with uneven data spacing cannot be used for prediction. (The same can be said for SARIMAX, which also uses exogenous variables.)

Visualization

fig, ax = plt.subplots()

y = df['y']

y.plot(ax=ax,label='y')
predict.predicted_mean.plot(label='x')
predict_ci = predict.conf_int(alpha=0.05)
predict_index = predict_ci.index
ax.fill_between(predict_index[2:], predict_ci.iloc[2:, 0], predict_ci.iloc[2:, 1], alpha=0.1)

forecast.predicted_mean.plot(ax=ax, style='r', label='forecast')
forecast_ci = forecast.conf_int()
forecast_index = forecast_ci.index
ax.fill_between(forecast_index, forecast_ci.iloc[:, 0], forecast_ci.iloc[:, 1], alpha=0.1)

legend = ax.legend(loc='best');
fig.savefig('custom_statespace.png')

可視化.png

The light blue part is the estimated 95% confidence interval and the pink part is the predicted 95% confidence interval.

References

This article was written with reference to the following information.

Thanks to Chad Fulton, the developer of statsmodels!

Recommended books for state space models

We will also introduce recommended books other than those listed above. They are listed in order of relative readability.

Books I bought but haven't read yet. .. ..

Revision history

October 3, 2020: First edition

Recommended Posts

Custom state space model in Python
Custom sort in Python3
Visualize Keras model in Python 3.5
General Gaussian state-space model in Python
Learn the design pattern "State" in Python
Implement a Custom User Model in Django
Python in optimization
CURL in python
Metaprogramming in Python
Python 3.3 in Anaconda
Geocoding in python
Meta-analysis in Python
Unittest in python
Epoch in Python
Discord in Python
Sudoku in Python
DCI in Python
quicksort in python
nCr in python
N-Gram in Python
Programming in python
Plink in Python
Constant in python
Lifegame in Python.
FizzBuzz in Python
StepAIC in Python
N-gram in python
Csv in python
Disassemble in Python
Reflection in Python
Implementation of particle filters in Python and application to state space models
Constant in python
nCr in Python.
format in python
Scons in Python3
Puyo Puyo in python
python in virtualenv
PPAP in Python
Quad-tree in Python
Chemistry in Python
Hashable in python
DirectLiNGAM in Python
LiNGAM in Python
Flatten in python
flatten in python
I tried to implement TOPIC MODEL in Python
Create a simple momentum investment model in Python
Use a custom error page in python / tornado
Transfer function / state space model of RLC series circuit and simulation by Python
Daily AtCoder # 36 in Python
Clustering text in Python
Implement Enigma in python
Daily AtCoder # 32 in Python
Daily AtCoder # 6 in Python
Daily AtCoder # 18 in Python
Edit fonts in Python
Singleton pattern in Python
File operations in Python
Read DXF in python
Daily AtCoder # 53 in Python
Transfer function / state space model of spring / mass / damper system and simulation by Python