The Cholesky decomposition of the covariance matrix can be used to convert random numbers drawn independently from the standard normal distribution into random numbers that follow a multivariate normal distribution. Why does this work? To explain, this article covers:

- Change of variables for random variables
- Derivation of the multivariate normal distribution
- Properties of the variance-covariance matrix
- The Cholesky decomposition algorithm
We convert the random variables
\begin{eqnarray}
X_1,\ X_2,\ \cdots ,\ X_n
\end{eqnarray}
into the random variables
\begin{eqnarray}
Y_1,\ Y_2,\ \cdots ,\ Y_n
\end{eqnarray}
where $X$ and $Y$ are assumed to have the probability density functions
\begin{eqnarray}
&f_X&\left(x_1,\ x_2,\ \cdots ,\ x_n\right)\\
&f_Y&\left(y_1,\ y_2,\ \cdots ,\ y_n\right)
\end{eqnarray}
respectively.
Each $Y_i$ is expressed as a bijective function of $X$:
\begin{eqnarray}
\begin{cases}
Y_1 = h_1\left(X_1,\ X_2,\ \cdots ,\ X_n\right)&\\
Y_2 = h_2\left(X_1,\ X_2,\ \cdots ,\ X_n\right)&\\
\ \vdots &\\
Y_n = h_n\left(X_1,\ X_2,\ \cdots ,\ X_n\right)
\end{cases}
\end{eqnarray}
Since the map $\left(h_1,\cdots ,h_n\right)$ is bijective, it has an inverse:
\begin{eqnarray}
\begin{cases}
X_1 = g_1\left(Y_1,\ Y_2,\ \cdots ,\ Y_n\right)&\\
X_2 = g_2\left(Y_1,\ Y_2,\ \cdots ,\ Y_n\right)&\\
\ \vdots &\\
X_n = g_n\left(Y_1,\ Y_2,\ \cdots ,\ Y_n\right)
\end{cases}
\end{eqnarray}
For any subset $D$ of $n$-dimensional real space, the change-of-variables formula for integrals
\begin{eqnarray}
&\int\cdots\int_D& f_X\left(x_1,x_2,\cdots ,x_n\right) dx_1dx_2\cdots dx_n \nonumber \\
= &\int\cdots\int_{D'}& f_X\left(g_1\left(y\right),g_2\left(y\right),\cdots ,g_n\left(y\right)\right)\left|\frac{\partial\left( x_1,x_2,\cdots ,x_n\right) }{\partial\left(y_1,y_2,\cdots ,y_n\right) }\right| dy_1dy_2\cdots dy_n
\end{eqnarray}
holds, where $D'$ is the image of $D$ under the map $\left(h_1,\cdots ,h_n\right)$ and $y=\left(y_1,\cdots ,y_n\right)$.
Here the Jacobian is
\begin{eqnarray}
\left|\frac{\partial\left( x_1,x_2,\cdots ,x_n\right) }{\partial\left(y_1,y_2,\cdots ,y_n\right) }\right| =
\left|\det
\begin{pmatrix}
\frac{\partial g_1}{\partial y_1}&\frac{\partial g_1}{\partial y_2}&\cdots&\frac{\partial g_1}{\partial y_n}\\
\frac{\partial g_2}{\partial y_1}&\ddots&&\vdots\\
\vdots &&\ddots&\vdots\\
\frac{\partial g_n}{\partial y_1}&\cdots&\cdots&\frac{\partial g_n}{\partial y_n}
\end{pmatrix}
\right|
\end{eqnarray}
From this we obtain the transformation formula for the probability density function:
\begin{eqnarray}
f_Y\left(y_1,y_2,\cdots ,y_n\right)
= f_X\left(g_1\left(y\right),g_2\left(y\right),\cdots ,g_n\left(y\right)\right)\left|\frac{\partial\left( x_1,x_2,\cdots ,x_n\right) }{\partial\left(y_1,y_2,\cdots ,y_n\right) }\right|
\end{eqnarray}
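As a quick numerical check of this formula (a sketch not in the original article): for the one-dimensional bijection $Y = 2X + 1$ with $X \sim N(0,1)$, the inverse is $g(y) = (y-1)/2$ with $|dg/dy| = 1/2$, and the resulting $f_Y$ should equal the density of $N(1, 4)$.

```python
import math

def f_X(x):
    # standard normal density
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def f_Y(y):
    # transformed density: f_Y(y) = f_X(g(y)) * |dg/dy| with g(y) = (y - 1) / 2
    return f_X((y - 1) / 2) * 0.5

def normal_pdf(y, mu, sigma):
    # N(mu, sigma^2) density for comparison
    return math.exp(-((y - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# the transformed density agrees with N(1, 2^2) at every point we check
for y in [-1.0, 0.0, 1.0, 2.5]:
    assert abs(f_Y(y) - normal_pdf(y, 1, 2)) < 1e-12
```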
Write the one-dimensional standard normal distribution as $N\left(0,1\right)$.
Take $n$ mutually independent standard normal random variables
\begin{eqnarray}
Z_1,\ Z_2,\cdots ,\ Z_n \sim N\left(0,1\right)\ \textrm{i.i.d}
\end{eqnarray}
Since they are independent, their joint density function is the product of the individual densities:
\begin{eqnarray}
f_Z\left(z\right) = \frac{1}{\left(2\pi\right)^{n/2}} \exp\left(-\frac{1}{2}z^{\mathrm{T}}z\right)
\end{eqnarray}
Now apply the change of variables
\begin{eqnarray}
X &=& AZ + \mu
\end{eqnarray}
where $Z$ and $X$ are $n$-dimensional random vectors, $A$ is an invertible $n\times n$ matrix, and $\mu$ is an $n$-dimensional vector:
\begin{eqnarray}
X&=&\left(X_1,\ X_2,\ \cdots\ X_n\right)^{\mathrm{T}} \nonumber \\
Z&=&\left(Z_1,\ Z_2,\ \cdots\ Z_n\right)^{\mathrm{T}} \nonumber \\
A&=&
\begin{pmatrix}
a_{1,1}&a_{1,2}&\cdots&a_{1,n}\\
a_{2,1}&\ddots&&\vdots\\
\vdots&&\ddots&\vdots\\
a_{n,1}&\cdots&\cdots&a_{n,n}
\end{pmatrix} \nonumber \\
\mu&=&\left(\mu_1,\ \mu_2,\ \cdots\ \mu_n\right)^{\mathrm{T}} \nonumber
\end{eqnarray}
The Jacobian of this change of variables is
\begin{eqnarray}
\left|\frac{\partial\left( x_1,x_2,\cdots ,x_n\right) }{\partial\left(z_1,z_2,\cdots ,z_n\right) }\right| &=& \det\left( A\right)
\end{eqnarray}
so
\begin{eqnarray}
\left|\frac{\partial\left( z_1,z_2,\cdots ,z_n\right) }{\partial\left(x_1,x_2,\cdots ,x_n\right) }\right| &=& \frac{1}{\det\left( A\right)}
\end{eqnarray}
Therefore, the joint density function of $X$ is
\begin{eqnarray}
f_X\left(x\right)&=&f_Z\left(A^{-1}\left(x-\mu\right)\right)\frac{1}{\det\left(A\right)} \nonumber \\
&=&\frac{1}{\left(2\pi\right)^{n/2}\det\left(A\right)}\exp\left\{-\frac{1}{2}\left(x-\mu\right)^{\mathrm{T}}\left(AA^{\mathrm{T}}\right)^{-1}\left(x-\mu\right)\right\}
\end{eqnarray}
The mean vector and variance-covariance matrix of the transformed random variables are
\begin{eqnarray}
E\left[X\right]&=&AE\left[Z\right]+\mu \nonumber \\
&=&\mu
\\
Cov\left(X\right)&=&E\left[\left(X-\mu\right)\left(X-\mu\right)^{\mathrm{T}}\right] \nonumber \\
&=&E\left[AZZ^{\mathrm{T}}A^{\mathrm{T}}\right] \nonumber \\
&=&AE\left[ZZ^{\mathrm{T}}\right]A^{\mathrm{T}} \nonumber \\
&=&AA^{\mathrm{T}}
\end{eqnarray}
where the last step uses $E\left[ZZ^{\mathrm{T}}\right]=I_n$, which holds because the $Z_i$ are independent standard normals.
We denote this covariance matrix by
\begin{eqnarray}
\Sigma = AA^{\mathrm{T}}
\end{eqnarray}
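This identity is easy to check by simulation (a minimal sketch; the matrix $A$ here is an arbitrary example, not from the article): the sample covariance of $X = AZ + \mu$ should approach $AA^{\mathrm{T}}$, and the sample mean should approach $\mu$.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[1.0, 0.0],
              [0.5, 1.2]])              # example transformation matrix
mu = np.array([2.0, 3.0])

Z = rng.standard_normal((2, 100_000))   # independent standard normals
X = A @ Z + mu[:, None]                 # X = AZ + mu, one column per sample

Sigma_empirical = np.cov(X)             # sample variance-covariance matrix
Sigma_theory = A @ A.T                  # Sigma = A A^T
assert np.allclose(Sigma_empirical, Sigma_theory, atol=0.05)
assert np.allclose(X.mean(axis=1), mu, atol=0.05)
```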
Substituting $\Sigma$, we obtain the joint density function of the multivariate normal distribution:
\begin{eqnarray}
f_X\left(x\right)
&=&\frac{1}{\left(2\pi\right)^{n/2}\det\left(A\right)}\exp\left\{-\frac{1}{2}\left(x-\mu\right)^{\mathrm{T}}\left(AA^{\mathrm{T}}\right)^{-1}\left(x-\mu\right)\right\} \nonumber \\
&=&\frac{1}{\left(2\pi\right)^{n/2}\det\left(\Sigma\right)^{1/2}}\exp\left\{-\frac{1}{2}\left(x-\mu\right)^{\mathrm{T}}\Sigma^{-1}\left(x-\mu\right)\right\}
\end{eqnarray}
using $\det\left(\Sigma\right) = \det\left(A\right)^2$.
To generate random numbers from a variance-covariance matrix via Cholesky decomposition, the matrix must be positive definite.
Here we review the general properties of the variance-covariance matrix and check when the Cholesky decomposition is possible, starting with some facts from linear algebra.
The following are equivalent:

- The $n\times n$ symmetric matrix $A$ is positive semidefinite
- $x^{\mathrm{T}}Ax \geq 0$ for every vector $x$
- Every principal minor of $A$ is non-negative
- All eigenvalues of $A$ are non-negative
The following are also equivalent:

- The $n\times n$ symmetric matrix $A$ is positive definite
- $x^{\mathrm{T}}Ax > 0$ for every nonzero vector $x$
- Every leading principal minor of $A$ is positive
- All eigenvalues of $A$ are positive
We prove the following claim: **the variance-covariance matrix $\Sigma$ is positive semidefinite.**
Let $\Sigma$ be the $n\times n$ variance-covariance matrix whose components are the variances $\sigma_i$ and covariances $\rho_{i,j}$. Since $\rho_{i,j} = \rho_{j,i}$, $\Sigma$ is symmetric. Let $a$ be an arbitrary $n$-dimensional constant vector. Then
\begin{eqnarray}
a^{\mathrm{T}}\Sigma a &=& a^{\mathrm{T}}E\left[\left(X-E\left[X\right]\right)\left(X-E\left[X\right]\right)^{\mathrm{T}}\right]a \nonumber \\
&=&E\left[a^{\mathrm{T}}\left(X-E\left[X\right]\right)\left(X-E\left[X\right]\right)^{\mathrm{T}}a\right] \nonumber \\
&=&E\left[\langle a,X-E\left[X\right]\rangle^2\right] \nonumber \\
&\geq&0 \nonumber
\end{eqnarray}
Therefore, $\Sigma$ is positive semidefinite.
In practice, the variance-covariance matrix is expected to be symmetric positive definite in most cases;
note that the random-number generation below fails unless $\Sigma$ is symmetric positive definite.
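The semidefiniteness claim can also be verified numerically (a small sketch on random data, not from the article): every sample covariance matrix is symmetric with non-negative eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.standard_normal((5, 1000))   # 5 variables, 1000 observations
Sigma = np.cov(data)                    # sample variance-covariance matrix

eigvals = np.linalg.eigvalsh(Sigma)     # eigenvalues of the symmetric matrix
assert np.allclose(Sigma, Sigma.T)      # symmetric
assert np.all(eigvals >= -1e-10)        # positive semidefinite: all eigenvalues >= 0
```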
## 1.4. Cholesky decomposition
The Cholesky decomposition factorizes a positive definite matrix $A$ into a lower triangular matrix $L$ and its transpose:
\begin{eqnarray}
A&=&LL^{\mathrm{T}}
\\
\nonumber \\
A&=&
\begin{pmatrix}
a_{1,1}&a_{1,2}&\cdots&a_{1,n}\\
a_{2,1}&\ddots&&\vdots\\
\vdots&&\ddots&\vdots\\
a_{n,1}&\cdots&\cdots&a_{n,n}
\end{pmatrix} \nonumber \\
L&=&
\begin{pmatrix}
l_{1,1}&0&\cdots&\cdots&0\\
l_{2,1}&l_{2,2}&0&&\vdots\\
\vdots&&\ddots&\ddots&\vdots\\
\vdots&&&l_{n-1,n-1}&0\\
l_{n,1}&\cdots&\cdots&l_{n,n-1}&l_{n,n}
\end{pmatrix} \nonumber
\end{eqnarray}
The Cholesky factor is constructed as follows:
\begin{eqnarray}
l_{i,i}&=&\sqrt{a_{i,i}-\sum_{k=1}^{i-1}l_{i,k}^2}\ \ \ \ \ \ \left(\textrm{for}\ i=1,\cdots ,n\right) \\
l_{j,i}&=&\left(a_{j,i}-\sum_{k=1}^{i-1}l_{j,k}\ l_{i,k}\right)/l_{i,i}\ \ \ \left(\textrm{for}\ j=i+1,\cdots ,n\right)
\end{eqnarray}
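The recurrences above translate directly into code. The following is a sketch implementation (the function name `cholesky_lower` is my own), checked against `np.linalg.cholesky` on an example positive definite matrix.

```python
import numpy as np

def cholesky_lower(A):
    """Cholesky decomposition A = L L^T via the recurrences above."""
    n = len(A)
    L = np.zeros((n, n))
    for i in range(n):
        # diagonal entry: l_ii = sqrt(a_ii - sum_k l_ik^2)
        L[i, i] = np.sqrt(A[i, i] - np.sum(L[i, :i] ** 2))
        # entries below the diagonal: l_ji = (a_ji - sum_k l_jk l_ik) / l_ii
        for j in range(i + 1, n):
            L[j, i] = (A[j, i] - np.sum(L[j, :i] * L[i, :i])) / L[i, i]
    return L

A = np.array([[4.0, 2.0, 0.6],
              [2.0, 2.0, 0.5],
              [0.6, 0.5, 3.0]])          # positive definite example
L = cholesky_lower(A)
assert np.allclose(L @ L.T, A)           # L L^T reconstructs A
assert np.allclose(L, np.linalg.cholesky(A))
```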
Now assume the variance-covariance matrix $\Sigma$, whose components are the variances $\sigma_i$ and covariances $\rho_{i,j}$, is positive definite.
Applying the Cholesky decomposition to $\Sigma$ gives $A = L$ (a lower triangular matrix), so multivariate normal random numbers are generated by
\begin{eqnarray}
x&=&Lz+\mu \\
\nonumber \\
L&=&
\begin{pmatrix}
l_{1,1}&0&\cdots&\cdots&0\\
l_{2,1}&l_{2,2}&0&&\vdots\\
\vdots&&\ddots&\ddots&\vdots\\
\vdots&&&l_{n-1,n-1}&0\\
l_{n,1}&\cdots&\cdots&l_{n,n-1}&l_{n,n}
\end{pmatrix} \nonumber \\
l_{i,i}&=&\sqrt{\sigma_{i}-\sum_{k=1}^{i-1}l_{i,k}^2}\ \ \ \ \ \ \left(\textrm{for}\ i=1,\cdots ,n\right) \nonumber \\
l_{j,i}&=&\left(\rho_{j,i}-\sum_{k=1}^{i-1}l_{j,k}\ l_{i,k}\right)/l_{i,i}\ \ \ \left(\textrm{for}\ j=i+1,\cdots ,n\right) \nonumber
\end{eqnarray}
multivariate_normal_distribution.py

```python
import numpy as np

# mean vector
mu = np.array([[2],
               [3]])
# variance-covariance matrix
cov = np.array([[1, 0.2],
                [0.2, 1]])

L = np.linalg.cholesky(cov)
dim = len(cov)
n = 10000  # number of samples to generate
random_list = []
for i in range(n):
    z = np.random.randn(dim, 1)  # z ~ N(0, I)
    random_list.append((np.dot(L, z) + mu).reshape(1, -1).tolist()[0])  # x = Lz + mu
```
I experimented with a few different settings (the numbers indicate the elements of the covariance matrix).
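To confirm that the generated samples actually follow the target distribution, one can compare their sample mean and covariance with $\mu$ and $\Sigma$ (a quick check, not part of the original script):

```python
import numpy as np

rng = np.random.default_rng(42)
mu = np.array([[2], [3]])
cov = np.array([[1, 0.2],
                [0.2, 1]])
L = np.linalg.cholesky(cov)

n = 100_000
samples = L @ rng.standard_normal((2, n)) + mu   # x = Lz + mu, one column per sample

assert np.allclose(samples.mean(axis=1), mu.ravel(), atol=0.05)  # sample mean ~ mu
assert np.allclose(np.cov(samples), cov, atol=0.05)              # sample covariance ~ Sigma
```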