[PYTHON] Introducing Dynamic Mode Decomposition

Introduction

I could only think of Fourier transform or wavelet transform as a method to investigate the time change, but I learned that there is a good method called dynamic mode decomposition (DMD) that can extract both temporal and spatial modes. I will record it here to deepen my understanding.

The content is almost the same as the site that I referred to, and I will write a slightly modified version of Google Translate. The code related to graph drawing is not in the reference source, so I added it. You should be able to do everything from running the DMD to drawing the graph.

reference

For simplicity, we will omit the DMD of the 3D vector field and consider only a simple 1D scalar function. Dynamic Mode Decomposition in Python

I didn't know what SVD was, so I referred to this. About the relationship between PCA and SVD

Dynamic mode decomposition

Dynamic Mode Decomposition (DMD) is a relatively recent mathematical innovation that can solve or approximate dynamic systems, among other things, with respect to coherent structures that grow, decay, and / or oscillate over time. The coherent structure is called DMD mode. Each DMD mode has a corresponding time dynamic defined for a single eigenvalue.

In other words, DMD transforms a dynamic system into a superposition of modes whose dynamics are dominated by eigenvalues.

Surprisingly, the mathematical procedure for distinguishing DMD modes from eigenvalues is purely linear, but the system itself can be non-linear. Although not covered here, the claim that nonlinear systems can be described as a set of modes and eigenvalue pairs has sound rationale. For more information, read the paper on the integration of the Koopman operator and DMD [^ 1] [^ 2] [^ 3].

DMDs are not only useful diagnostic tools for analyzing the internal behavior of your system, but can also be used to predict the future state of your system. All you need is the mode and the eigenvalues. With little effort, you can combine modes and eigenvalues to generate a function that approximates the system state at any time.

Official definition

Consider n datasets. ${(x_0,y_0),(x_1,y_1),\dots (x_n,y_n)\}$

Where $ x_i $ and $ y_i $ are column vectors of magnitude $ m $, respectively. Define two $ m \ times n $ matrices. $X=[x_0\ x_1\ \dots\ x_n],\quad Y=[y_0\ y_1\ \dots\ y_n]$

If you define the operator $ A $ as follows $A=YX^\dagger$

Where $ X ^ \ dagger $ is the reciprocal of $ X $, and the dynamic mode decomposition of $ (X, Y) $ is given by the eigenvalue decomposition of $ A $. So DMD mode and eigenvalues are $ A $ eigenvectors and eigenvalues.

The definition by Tu et al. [^ 2] above is known as exact DMD. This is currently the most common definition and can be applied to any dataset that meets certain requirements. In this article, I'm most interested if $ A $ satisfies the formula $ y_i = Ax_i $ for all $ i $ (probably roughly). Or, more precisely: $Y=AX$

Obviously, $ X $ is a set of input vectors and $ Y $ is a set of corresponding output vectors. This particular interpretation of DMD is very powerful because it provides a convenient way to analyze (and predict) dynamic stems with unknown governing equations. We'll talk more about dynamic systems later.

There are several theorems that follow this definition of DMD [^ 2]. One of the more useful theorems is that when $ X $ and $ Y $ are linearly consistent (in other words, when $ Xv = 0 $ with respect to the vector $ v $, $ Yv = 0 Only if $ is also satisfied), $ Y = AX $ is completely satisfied. Testing linear consistency is relatively easy, as we will see later. In other words, linear consistency is not a mandatory prerequisite for using DMD. Even if the DMD decomposition of $ A $ does not completely satisfy the equation $ Y = AX $, it is the least squares method and minimizes the error in the $ L ^ 2 $ norm.

DMD algorithm

At first glance, the eigenvalue decomposition of $ A = YX ^ \ dagger $ does not seem to be a big issue. In fact, if the $ X $ and $ Y $ sizes are right, calling the pinv and eig methods from Numpy or MATLAB a couple of times will work. The problem arises when $ A $ is really big. Since $ A $ is $ m \ times m $, eigenvalue decomposition becomes cumbersome if $ m $ (the number of signals in each time sample) is very large.

Fortunately, with the help of the exact DMD algorithm, you can break the problem into smaller pieces.

  1. Calculate SVD (singular value decomposition) of $ X $ and at the same time perform low-level truncation if necessary: $X=U\Sigma V^*$

  2. Project the matrix $ A $ onto $ U $ to calculate $ \ tilde A : $\tilde A=U^* AU=U^*YV\Sigma^{-1}$$

  3. Calculate the eigenvalue $ \ lambda_i $ and the eigenvector $ w_i $ for $ \ tilde A : $\tilde AW=W\Lambda$$

  4. Reconstruct the eigenvalue decomposition of $ A $ from $ W $ and $ \ Lambda $. The eigenvalues of $ A $ are equivalent to the eigenvalues of $ \ tilde A $. The eigenvectors of $ A $ are given in the columns $ \ Phi . $A\Phi=\Phi\Lambda,\quad \Phi=YV\Sigma^{-1}W$$

A more detailed explanation of the derivation of the algorithm can be found in references [^ 1] [^ 2]. It may also be interesting to note that $ \ Phi = UW $ is an alternative derivation of $ \ Phi $ called projected DMD mode. This article uses only exact DMD mode.

Let's take a step-by-step look at the algorithm in Python. Start by installing and importing all the packages you need.

import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from numpy import dot, multiply, diag, power
from numpy import pi, exp, sin, cos, cosh, tanh, real, imag
from numpy.linalg import inv, eig, pinv
from scipy.linalg import svd, svdvals
from scipy.integrate import odeint, ode, complex_ode
from warnings import warn

#This is my favorite
plt.rcParams["font.family"] = "Times New Roman"      #Set the entire font
plt.rcParams["xtick.direction"] = "in"               #Inward the x-axis scale line
plt.rcParams["ytick.direction"] = "in"               #Inward the y-axis scale line
plt.rcParams["xtick.minor.visible"] = True           #Addition of x-axis auxiliary scale
plt.rcParams["ytick.minor.visible"] = True           #Addition of y-axis auxiliary scale
plt.rcParams["xtick.major.width"] = 1.5              #Line width of x-axis main scale line
plt.rcParams["ytick.major.width"] = 1.5              #Line width of y-axis main scale line
plt.rcParams["xtick.minor.width"] = 1.0              #Line width of x-axis auxiliary scale line
plt.rcParams["ytick.minor.width"] = 1.0              #Line width of y-axis auxiliary scale line
plt.rcParams["xtick.major.size"] = 10                #x-axis main scale line length
plt.rcParams["ytick.major.size"] = 10                #Length of y-axis main scale line
plt.rcParams["xtick.minor.size"] = 5                 #x-axis auxiliary scale line length
plt.rcParams["ytick.minor.size"] = 5                 #Length of y-axis auxiliary scale line
plt.rcParams["font.size"] = 14                       #Font size
plt.rcParams["axes.linewidth"] = 1.5                 #Enclosure thickness

Let's generate some play data. Note that in practice you do not necessarily know the governing equation of the data. Here we are creating some equations to create a dataset. Once the data is generated, forget that they exist.

#Define time domain and spatial domain
x = np.linspace(-10, 10, 100)
t = np.linspace(0, 6*pi, 80)
dt = t[2] - t[1]
Xm,Tm = np.meshgrid(x, t)

#Create three time and spatial patterns
f1 = multiply(20-0.2*power(Xm, 2), exp((2.3j)*Tm))
f2 = multiply(Xm, exp(0.6j*Tm))
f3 = multiply(5*multiply(1/cosh(Xm/2), tanh(Xm/2)), 2*exp((0.1+2.8j)*Tm))

#Combine signals to create a data matrix
D = (f1 + f2 + f3).T

#Create a DMD I / O matrix
X = D[:,:-1]
Y = D[:,1:]
fig = plt.figure(figsize=(10,6))
ax = fig.add_subplot(111, projection="3d")
surf = ax.plot_surface(Xm, Tm, np.real(D.T), cmap="gray", linewidth=0)
ax.set_xlabel("x")
ax.set_ylabel("t")
fig.colorbar(surf)

output_8_1.png

Now calculate the SVD for $ X $. The first variable of primary interest is the $ X $ singular value, $ \ Sigma $. Obtaining the $ X $ SVD allows you to extract the "high energy" mode and reduce the dimensions of your system with proper orthogonal decomposition (Proper Orthogonal Decomposition, POD). Looking at the singular value determines the number of modes to truncate.

#Input matrix SVD
U2,Sig2,Vh2 = svd(X, False)
fig, ax = plt.subplots()
ax.scatter(range(len(Sig2)), Sig2, label="singular values")
ax.legend()

output_11_1.png

Given the singular values above, we can conclude that the data has three important modes. Therefore, truncate the SVD to include only these modes. Then build $ \ tilde A $ and find its eigenvalue decomposition.

#Rank 3 truncated
r = 3
U = U2[:,:r]
Sig = diag(Sig2)[:r,:r]
V = Vh2.conj().T[:,:r]

# A~To build
Atil = dot(dot(dot(U.conj().T, Y), V), inv(Sig))
mu,W = eig(Atil)
def circle(r=1):
    x,y = [],[]
    for _x in np.linspace(-180,180,360):
        x.append(np.sin(np.radians(_x)))
        y.append(np.cos(np.radians(_x)))
    return x,y

c_x,c_y = circle(r=1)

fig, ax = plt.subplots(figsize=(8,8))
ax.plot(c_x, c_y, c="k", linestyle="dashed")
ax.scatter(np.real(mu[0]), np.imag(mu[0]), label="1st")
ax.scatter(np.real(mu[1]), np.imag(mu[1]), label="2nd")
ax.scatter(np.real(mu[2]), np.imag(mu[2]), label="3rd")
ax.set_aspect("equal")
ax.set_xlabel(r"$\it{Re}\,\mu$")
ax.set_ylabel(r"$\it{Im}\,\mu$")
ax.legend()

output_14_1.png

Each unique value of $ \ Lambda $ tells you about the dynamic behavior of the corresponding DMD mode. Its exact interpretation depends on the nature of the relationship between $ X $ and $ Y $. In the case of difference equations, many conclusions can be drawn. If the eigenvalue has a non-zero imaginary part, there is vibration in the corresponding DMD mode. If the eigenvalues are within the unit circle, the mode is attenuated. If the eigenvalues are on the outside, the mode is growing. If the eigenvalues fit the unit circle exactly, the mode will neither grow nor decay.

Next, build the exact DMD mode.

#Build DMD mode
Phi = dot(dot(dot(Y, V), inv(Sig)), W)
fig, ax = plt.subplots()
ax.plot(x, np.real(Phi[:,0]), label="1st mode")
ax.plot(x, np.real(Phi[:,1]), label="2nd mode")
ax.plot(x, np.real(Phi[:,2]), label="3rd mode")
ax.set_xlabel("x")
ax.legend()

output_18_1.png

The $ \ Phi $ column is the DMD mode plotted above. They are a consistent structure that grows / damps / oscillates within the system according to different time dynamics. Compare the curve in the above plot with the rotating, evolving shape found in the original 3D surface plot. You will notice similarities.

This is the technical end of the DMD algorithm. With the eigenvalue decomposition of $ A $ and a basic understanding of the nature of the system $ Y = AX $, we can build the matrix $ \ Psi $ for the time evolution of the system. To fully understand the code below, see the difference equation function $ x (t) $ in the next section.

#Calculation of time evolution
b = dot(pinv(Phi), X[:,0])
Psi = np.zeros([r, len(t)], dtype='complex')
for i,_t in enumerate(t):
    Psi[:,i] = multiply(power(mu, _t/dt), b)
fig = plt.figure(figsize=(10,10))
ax1,ax2,ax3 = fig.add_subplot(311), fig.add_subplot(312), fig.add_subplot(313)
plt.subplots_adjust(wspace=0.5, hspace=0.5)
ax1.set_title("1st mode"), ax2.set_title("2nd mode"), ax3.set_title("3rd mode")

ax1.plot(t, np.real(Psi[0]), label=r"$\it{Re}\,\Psi$")
ax1.plot(t, np.imag(Psi[0]), label=r"$\it{Re}\,\Psi$")
ax2.plot(t, np.real(Psi[1])), ax2.plot(t, np.imag(Psi[1]))
ax3.plot(t, np.real(Psi[2])), ax3.plot(t, np.imag(Psi[2]))
ax3.set_xlabel("t")

fig.legend()

output_22_1.png

The above three plots are the time dynamics of the three DMD modes. Notice that all three are vibrating. In addition, the second mode appears to grow exponentially. This is confirmed by the eigenvalue plot.

To create an approximation of the original data matrix, simply multiply $ \ Phi $ by $ \ Psi $. In this particular case, the approximation matches the original exactly.

#DMD reconstruction calculation
D2 = dot(Phi, Psi)
np.allclose(D, D2) # True

Dynamic system

In this article, we will consider only two interpretations of the expression $ Y = AX $. The first interpretation is where $ A $ defines the difference equation. $x_{i+1}=Ax_i$

In this case, the operator $ A $ advances the dynamic system state $ x_i $ by one step in time. Let's say you have a time series $ D . $D=[x_0\ x_1\ \dots\ x_{n+1}]$$

Where $ x_i $ is a column vector that defines the $ m $ dimensional state of the system at the time step $ i $. Then you can define $ X $ and $ Y $ as follows: $X=[x_0\ x_1\ \dots\ x_{n}],\quad Y=[x_1\ x_2\ \dots\ x_{n+1}]$

In this way, each pair of $ X $ and $ Y $ column vectors corresponds to a single iteration of the difference equation, which is typically: $Y=AX$

Use DMD to find the eigendecomposition of $ A \ Phi = \ Phi \ Lambda $. Using DMD mode and eigenvalues, you can easily convert $ Y = AX $ to a function defined by the discrete-time iteration $ k $ of the time step $ \ Delta t . $x_k=\Phi\Lambda^k\Phi^\dagger x_0$$

The corresponding function for continuous time $ t $ is $x(t)=\Phi\Lambda^{t/\Delta t}\Phi^\dagger x(0)$

What's really amazing is that I defined an explicit function in time using only the data. This is a good example of equationless modeling.

The second interpretation of $ Y = AX $ considered in this article is where $ A $ defines the differential equation system. $\dot x=Ax$

In this case, the operator $ A $ computes the linear derivative of the vector $ x_i $ with respect to time. The matrices $ X $ and $ Y $ consist of $ n $ samples of vector fields. The $ i $ th column of $ X $ is the position vector $ x_i $. The $ i $ th column of $ Y $ is the velocity vector $ \ dot x_i $.

After calculating the DMD, the function of time is very similar to the previous one (that is, the difference equation). If $ x (0) $ is an arbitrary initial condition and $ t $ is a continuous time $x(t)=\Phi\text{exp}(\Lambda t)\Phi^\dagger x(0)$

Helper method

For convenience, combine the DMD code into one method, define several helper methods to check the linear consistency, and check the solution.

def nullspace(A, atol=1e-13, rtol=0):
    # from http://scipy-cookbook.readthedocs.io/items/RankNullspace.html
    A = np.atleast_2d(A)
    u, s, vh = svd(A)
    tol = max(atol, rtol * s[0])
    nnz = (s >= tol).sum()
    ns = vh[nnz:].conj().T
    return ns

def check_linear_consistency(X, Y, show_warning=True):
    # tests linear consistency of two matrices (i.e., whenever Xc=0, then Yc=0)
    A = dot(Y, nullspace(X))
    total = A.shape[1]
    z = np.zeros([total, 1])
    fails = 0
    for i in range(total):
        if not np.allclose(z, A[:,i]):
            fails += 1
    if fails > 0 and show_warning:
        warn('linear consistency check failed {} out of {}'.format(fails, total))
    return fails, total

def dmd(X, Y, truncate=None):
    U2,Sig2,Vh2 = svd(X, False) # SVD of input matrix
    r = len(Sig2) if truncate is None else truncate # rank truncation
    U = U2[:,:r]
    Sig = diag(Sig2)[:r,:r]
    V = Vh2.conj().T[:,:r]
    Atil = dot(dot(dot(U.conj().T, Y), V), inv(Sig)) # build A tilde
    mu,W = eig(Atil)
    Phi = dot(dot(dot(Y, V), inv(Sig)), W) # build DMD modes
    return mu, Phi

def check_dmd_result(X, Y, mu, Phi, show_warning=True):
    b = np.allclose(Y, dot(dot(dot(Phi, diag(mu)), pinv(Phi)), X))
    if not b and show_warning:
        warn('dmd result does not satisfy Y=AX')

Limitations

DMD has some known limitations. First of all, translation and rotation invariance cannot be handled particularly well. Second, if there is a temporary time operation, it can fail altogether. The following example illustrates these issues.

1. Translational invariance

The following dataset is very simple. It consists of a single mode (Gauss) that translates along the spatial domain as the system evolves. You might think that DMD handles this cleanly, but the opposite happens. Instead of getting a single, well-defined singular value, SVD gets many values.

#Define time domain and spatial domain
x = np.linspace(-10, 10, 50)
t = np.linspace(0, 10, 100)
dt = t[2] - t[1]
Xm,Tm = np.meshgrid(x, t)

#Create data in a single mode that moves both spatially and temporally
D = exp(-power((Xm-Tm+5)/2, 2))
D = D.T

#Extract the input / output matrix
X = D[:,:-1]
Y = D[:,1:]
check_linear_consistency(X, Y)

U2,Sig2,Vh2 = svd(X, False)
fig = plt.figure(figsize=(10,5))
ax1,ax2 = fig.add_subplot(121), fig.add_subplot(122)
ax1.set_xlabel("x"), ax1.set_ylabel("t")
ax1.imshow(D.T, aspect=0.5)
ax2.scatter(range(len(Sig2)), Sig2)

output_33_1.png

The plot on the left shows the system over time. The plot on the right shows the singular values. We have found that nearly 10 DMD modes are needed to accurately approximate the system. Consider the following plot. Compare true dynamics with varying numbers of overlapping modes.

def build_dmd_modes(t, X, Y, r):
    """
DMD reconstruction
    """
    mu, Phi = dmd(X, Y, truncate=r)
    b = dot(pinv(Phi), X[:,0])
    Psi = np.zeros([r, len(t)], dtype='complex')
    for i,_t in enumerate(t):
        Psi[:,i] = multiply(power(mu, _t/dt), b)
    return dot(Phi, Psi)

fig = plt.figure(figsize=(10,10))
axes = []
for i in range(9):
    axes.append(fig.add_subplot(331+i))
plt.subplots_adjust(wspace=0.5, hspace=0.5)

for i,ax in enumerate(axes):
    ax.set_title("{} modes".format(i+1))
    ax.imshow(np.real(build_dmd_modes(t, X, Y, r=i+1).T), aspect=0.5)

output_36_0.png

2. Transient time behavior

This last example examines a dataset that contains transient time dynamics. Specifically, it shows when Gauss is present and when it is not present in the data. Unfortunately, DMD cannot decompose this data accurately.

#Define time domain and spatial domain
x = np.linspace(-10, 10, 50)
t = np.linspace(0, 10, 100)
dt = t[2] - t[1]
Xm,Tm = np.meshgrid(x, t)

#Create data in a single temporary mode
D = exp(-power((Xm)/4, 2)) * exp(-power((Tm-5)/2, 2))
D = D.astype('complex').T

#Extract the input / output matrix
X = D[:,:-1]
Y = D[:,1:]
check_linear_consistency(X, Y)

#DMD decomposition
r = 1
mu,Phi = dmd(X, Y, r)
check_dmd_result(X, Y, mu, Phi)

#Time evolution
b = dot(pinv(Phi), X[:,0])
Psi = np.zeros([r, len(t)], dtype='complex')
for i,_t in enumerate(t):
    Psi[:,i] = multiply(power(mu, _t/dt), b)
fig = plt.figure(figsize=(10,10))
ax1,ax2 = fig.add_subplot(221),fig.add_subplot(222)
ax3,ax4 = fig.add_subplot(223),fig.add_subplot(224)

ax1.imshow(np.real(D), aspect=2.0), ax1.set_title("True")
ax2.imshow(np.real(dot(Phi, Psi).T), aspect=0.5), ax2.set_title("1-mode approx.")
ax3.plot(x, np.real(Phi)), ax3.set_xlabel("x"), ax3.set_title("mode1")
ax4.plot(t, np.real(Psi[0])), ax3.set_xlabel("t"), ax4.set_title("time evol. of mode 1")

output_39_1.png

The DMD correctly identifies the mode, but it cannot completely identify the behavior of time. This can be understood by considering that the time behavior of the DMD time series depends on the eigenvalues. Eigenvalues can only characterize a combination of exponential growth (the real part of the eigenvalue) and vibration (the imaginary part).

The interesting thing about this system is that the ideal decomposition can consist of a single mode (as shown) overlay with different eigenvalues. Imagine a single mode multiplied by a linear combination of many orthogonal and cosine (Fourier series) that approximates true time dynamics. Unfortunately, a single SVD-based DMD application cannot generate the same DMD mode multiple times with different eigenvalues.

Furthermore, it is important to note that even if the time behavior can be correctly extracted as a large number of eigenvalues, the solution prediction function is unreliable without a complete understanding of the transient behavior itself. Temporary behavior is not permanent by its nature.

Multi-resolution DMD (mrDMD) seeks to alleviate the problem of temporary time operation by applying DMD recursively.

Conclusion

Despite its limitations, DMD is a very powerful tool for analyzing and predicting dynamic systems. All data scientists in all backgrounds should have a good understanding of DMD and how to apply it. The purpose of this article is to provide the theory behind DMD and provide practical Python code examples that you can use with real data. We reviewed the formal definition of DMD, walked through the algorithm step by step, and tried some simple use cases, including failed ones. We hope this will give you a clearer understanding of how DMD applies to research or engineering projects.

DMD has many extensions. Future work may include posts on some of these extensions, such as multi-resolution DMD (mrDMD) and sparse DMD (sDMD).

Summary

If you want to DMD the time evolution of a 2D array, such as a high-speed camera image, flatten the 2D array into a 1D array and the above code will work. I will add an example of DMD of the simulation of Karman vortex with CFD.

Recommended Posts

Introducing Dynamic Mode Decomposition
Background / moving object separation using dynamic mode decomposition