Kalman filter that you can understand

About this article

Let's implement a Kalman filter using a random walk as an example.

What is a Kalman filter?

It is a technique for stripping away the mask called noise and estimating the true face underneath.

Theoretically

As a typical setting, consider a system like the one below.

\begin{align}
x_{t+1} &= F_t x_t + G_t w_t \tag{1.1}\\
y_t &= H_t x_t + v_t \tag{1.2}
\end{align}

$ x_t $ represents the internal state of the system and $ y_t $ is its observation. The internal state $ x_t $ evolves over time according to $ (1.1) $. The observed value $ y_t $, on the other hand, is contaminated with noise according to $ (1.2) $. The goal is to infer the internal state $ x_t $, which cannot be observed directly, from the observed values $ y_t $. To keep the setting simple, assume that $ w_t $ and $ v_t $ are independent of each other, each follows a Gaussian distribution, and each is uncorrelated across time (white noise). Using the independence of $ w_t $ and $ v_t $ and Bayes' theorem, we obtain the relations

\begin{align}
p(x_t|Y^t) &= \frac{p(y_t|x_t)p(x_t|Y^{t-1})}{p(y_t|Y^{t-1})} \tag{2.1}\\
p(x_{t+1}|Y^t) &= \int p(x_{t+1}|x_t)p(x_t|Y^t)dx_t \tag{2.2}
\end{align}

Here $ Y^t $ denotes the set of observations up to time $ t $, that is,

Y^t=\left\{y_1, y_2, ..., y_t\right\}

Now, from $ (2.1) $ we can obtain $ \hat{x}_{t|t} $, the estimate of $ x_t $ given the observations $ Y^t $ up to time $ t $. However, the right-hand side of $ (2.1) $ contains $ p(x_t|Y^{t-1}) $, so to obtain $ \hat{x}_{t|t} $ we first need $ \hat{x}_{t|t-1} $, the estimate of $ x_t $ given only the observations $ Y^{t-1} $ up to time $ t-1 $. This $ \hat{x}_{t|t-1} $ is obtained from $ (2.2) $, and looking at the right-hand side of $ (2.2) $ we see that it in turn requires $ \hat{x}_{t-1|t-1} $. Putting this together, estimating $ x_t $ proceeds by computing $ \hat{x}_{t|t-1} $ and $ \hat{x}_{t|t} $ alternately, which should now make intuitive sense.
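Before moving on, here is a minimal sketch of this forward recursion in Python. The functions predict and update are hypothetical placeholders for one application of $ (2.2) $ and $ (2.1) $ respectively; they are not defined here:

# A minimal sketch of the forward (filtering) recursion.
# predict() and update() are hypothetical placeholders for (2.2) and (2.1).
def filter_forward(observations, x_initial):
    x_pred = x_initial              # \hat{x}_{1|0}: the prior guess
    filtered = []
    for y in observations:
        x_filt = update(x_pred, y)  # (2.1): \hat{x}_{t|t} from \hat{x}_{t|t-1} and y_t
        filtered.append(x_filt)
        x_pred = predict(x_filt)    # (2.2): \hat{x}_{t+1|t} from \hat{x}_{t|t}
    return filtered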

On the other hand, once the observations up to the final time have been obtained, the relation

p(x_t|Y^N) = \int \frac{p(x_{t+1}|x_t)p(x_t|Y^t)p(x_{t+1}|Y^N)}{p(x_{t+1}|Y^t)}dx_{t+1} \tag{2.3}

can also be derived, where $ N $ is the final time. From $ (2.3) $ we can obtain $ \hat{x}_{t|N} $, the estimate of $ x_t $ given all observations $ Y^N $. However, looking at the third factor in the numerator on the right-hand side, we see that $ \hat{x}_{t+1|N} $ must be found beforehand. The process of computing $ \hat{x}_{t|N} $ therefore runs backward in time, from $ t=N $ toward $ t=1 $. Moreover, the second factor in the numerator shows that $ \hat{x}_{t|t} $ is also required, so we first run a forward pass over time and then a backward pass. This backward pass is called smoothing.
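The order of computation can be sketched in the same way. smooth_step is again a hypothetical placeholder, this time for one application of $ (2.3) $:

# A minimal sketch of the backward (smoothing) pass over the filtered estimates.
# smooth_step() is a hypothetical placeholder for (2.3).
def smooth_backward(filtered):
    smoothed = [None] * len(filtered)
    smoothed[-1] = filtered[-1]     # \hat{x}_{N|N} is already the smoothed value
    for t in range(len(filtered) - 2, -1, -1):
        smoothed[t] = smooth_step(filtered[t], smoothed[t + 1])
    return smoothed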

Experiment with a random walk

Random walk

As a simple example, consider a random walk. Equations $ (1.1) $ and $ (1.2) $ become

\begin{align}
x_{t+1} &= x_t + w_t \tag{3.1}\\
y_t &= x_t + v_t \tag{3.2}
\end{align}

Here, as before, $ w_t $ and $ v_t $ are independent Gaussian white noise. You can think of $ (3.1) $ as the true motion of the random walk, and $ (3.2) $ as noise added to the observations. We only get to see the observed $ y_t $, but we want to recover the original $ x_t $ by removing the noise.

Now, let's build a random walk in Python. The following three packages are sufficient.

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_context('poster')

We will implement the above formula. First, make a random walk of $ (3.1) $, and then add noise like $ (3.2) $.

def random_walker(start_position=0, mean=0, deviation=1, n_steps=99, seed=None):
    """Generate a random walk of n_steps Gaussian increments from start_position."""
    if seed is not None:
        np.random.seed(seed=seed)
    
    # Draw the step sizes, prepend the start position, and accumulate.
    move = np.random.normal(loc=mean, scale=deviation, size=n_steps)
    position = np.insert(move, 0, start_position)
    position = np.cumsum(position)
    
    return position



def add_noise(position, mean=0, deviation=10, seed=None):
    """Add independent Gaussian observation noise to each position."""
    if seed is not None:
        np.random.seed(seed=seed)
    
    n_observation = len(position)
    noise = np.random.normal(loc=mean, scale=deviation, size=n_observation)
    observation = position + noise
    
    return observation

Let's actually generate a noisy random walk. The start position is 0. Let the standard deviation of the true random walk steps ($ w_t $) be 1 and the standard deviation of the observation noise ($ v_t $) be 10 (note that the scale argument of np.random.normal is a standard deviation). The mean is 0 for both. There are 100 time steps.

true_position = random_walker(start_position=0, mean=0, deviation=1, n_steps=99, seed=0)
observed_position = add_noise(true_position, mean=0, deviation=10, seed=0)
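As a quick sanity check (not in the original article), the residual between the observed and true positions should have a mean near 0 and a standard deviation near 10:

# Sanity check: the injected noise should have mean ~0 and std ~10.
residual = observed_position - true_position
print(residual.mean(), residual.std())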

Let's make a graph.

plt.plot(true_position, 'r--', label='True Positions')
plt.plot(observed_position, 'y', label='Observed Positions')
plt.title('Random Walk')
plt.xlabel('time step')
plt.ylabel('position')
plt.legend(loc='best')

[Figure: random_walk.png — the true random walk and its noisy observations]

It's a wonderful random walk. True Positions is the true random walk $ x_t $, and Observed Positions is the noisy observation $ y_t $.

The observed values are dirty.

Kalman filter

Let's write down the Kalman filter formulas for this random walk model. Suppose $ w_t $ follows a Gaussian distribution with mean 0 and variance $ q $, and $ v_t $ follows a Gaussian distribution with mean 0 and variance $ r $. The derivation is omitted, but the formulas for computing $ \hat{x}_{t|t-1} $ and $ \hat{x}_{t|t} $ in the forward direction of time are

\begin{align}
&\hat{x}_{t|t} = \hat{x}_{t|t-1} + K_t(y_t - \hat{x}_{t|t-1}) \\
&\hat{x}_{t+1|t} = \hat{x}_{t|t} \\
&K_t = \frac{P_{t|t-1}}{P_{t|t-1} + r} \\
&P_{t|t} = \frac{r P_{t|t-1}}{P_{t|t-1} + r} \\
&P_{t+1|t} = P_{t|t} + q
\end{align}
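To make the numbers concrete, here is the very first step with the values we will use later, $ P_{1|0} = 1 $, $ q = 1 $, $ r = 10 $:

\begin{align}
K_1 &= \frac{P_{1|0}}{P_{1|0} + r} = \frac{1}{11} \approx 0.09 \\
P_{1|1} &= \frac{r P_{1|0}}{P_{1|0} + r} = \frac{10}{11} \\
P_{2|1} &= P_{1|1} + q = \frac{21}{11}
\end{align}

Because $ r $ is large relative to $ P_{1|0} $, the gain $ K_1 $ is small: the filter trusts its prediction far more than the first noisy observation.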

The formulas for computing $ \hat{x}_{t|N} $ in the reverse direction of time are

\begin{align}
&\hat{x}_{t|N} = \hat{x}_{t|t} + C_t(\hat{x}_{t+1|N} - \hat{x}_{t+1|t}) \\
&C_t = \frac{P_{t|t}}{P_{t|t} + q} \\
&P_{t|N} = P_{t|t} + C^2_t(P_{t+1|N} - P_{t+1|t})
\end{align}

Let's implement this.

class Simple_Kalman:
    
    def __init__(self, observation, start_position, start_deviation, deviation_true, deviation_noise):
        # deviation_true and deviation_noise play the role of the
        # variances q and r in the formulas above.
        self.obs = observation
        self.n_obs = len(observation)
        self.start_pos = start_position   # initial estimate x_{1|0}
        self.start_dev = start_deviation  # initial variance P_{1|0}
        self.dev_q = deviation_true
        self.dev_r = deviation_noise
        
        self._fit()

        
    def _forward(self):
        # Forward (filtering) pass.
        # x_prev_[t] = x_{t|t-1}, P_prev_[t] = P_{t|t-1} (predictions),
        # x_[t] = x_{t|t}, P_[t] = P_{t|t} (filtered), K_[t] = gain K_t.
        self.x_prev_ = [self.start_pos]
        self.P_prev_ = [self.start_dev]
        self.K_ = [self.P_prev_[0] / (self.P_prev_[0] + self.dev_r)]
        self.P_ = [self.dev_r * self.P_prev_[0] / (self.P_prev_[0] + self.dev_r)]
        self.x_ = [self.x_prev_[0] + self.K_[0] * (self.obs[0] - self.x_prev_[0])]
        
        for t in range(1, self.n_obs):
            # Prediction step.
            self.x_prev_.append(self.x_[t-1])
            self.P_prev_.append(self.P_[t-1] + self.dev_q)
            
            # Update step: gain, filtered estimate, filtered variance.
            self.K_.append(self.P_prev_[t] / (self.P_prev_[t] + self.dev_r))
            self.x_.append(self.x_prev_[t] + self.K_[t] * (self.obs[t] - self.x_prev_[t]))
            self.P_.append(self.dev_r * self.P_prev_[t] / (self.P_prev_[t] + self.dev_r))

            
    def _backward(self):
        # Backward (smoothing) pass, built from the final step toward the first.
        self.x_all_ = [self.x_[-1]]   # x_{N|N} is already the smoothed value
        self.P_all_ = [self.P_[-1]]
        self.C_ = [self.P_[-1] / (self.P_[-1] + self.dev_q)]
        
        for t in range(2, self.n_obs + 1):
            self.C_.append(self.P_[-t] / (self.P_[-t] + self.dev_q))
            self.x_all_.append(self.x_[-t] + self.C_[-1] * (self.x_all_[-1] - self.x_prev_[-t+1]))
            self.P_all_.append(self.P_[-t] + (self.C_[-1]**2) * (self.P_all_[-1] - self.P_prev_[-t+1]))
        
        # The lists were built in reverse time order; flip them back.
        self.C_.reverse()
        self.x_all_.reverse()
        self.P_all_.reverse()

            
    def _fit(self):
        self._forward()
        self._backward()
Now, let's apply it to the random walk. In practice you usually don't know the variance of the random walk steps or of the observation noise and have to estimate them; for now, let's simply plug in the values used to generate the data.

kf = Simple_Kalman(observed_position, start_position=0, start_deviation=1, deviation_true=1, deviation_noise=10)
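To quantify how much the filter helps (this check is not in the original article), we can compare mean squared errors against the true path; the exact values depend on the random seed:

# Compare mean squared errors against the true path.
for name, est in [('observed', observed_position),
                  ('filtered', np.array(kf.x_)),
                  ('smoothed', np.array(kf.x_all_))]:
    print(name, np.mean((est - true_position) ** 2))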

Let's make a graph.

plt.plot(true_position, 'r--', label='True Positions')
plt.plot(observed_position, 'y', label='Observed Positions')
plt.plot(kf.x_, 'blue', label='Forward Estimation')
plt.plot(kf.x_all_, 'black', label='Smoothed Estimation')
plt.title('Random Walk')
plt.xlabel('time step')
plt.ylabel('position')
plt.legend(loc='best')

[Figure: kalman_filter.png — true path, noisy observations, forward estimate, and smoothed estimate]

Forward Estimation shows the filtered estimate $ \hat{x}_{t|t} $, and Smoothed Estimation shows the smoothed estimate $ \hat{x}_{t|N} $.

The noise has been removed beautifully.

References

[Nonlinear Kalman Filter](https://www.amazon.co.jp/%E9%9D%9E%E7%B7%9A%E5%BD%A2%E3%82%AB%E3%83%AB%E3%83%9E%E3%83%B3%E3%83%95%E3%82%A3%E3%83%AB%E3%82%BF-%E7%89%87%E5%B1%B1-%E5%BE%B9/dp/4254201486/), Asakura Shoten, Toru Katayama

The formulas in this article are based on this book. It has a wonderful cover, with the noise removed.
