Derivation of multivariate t distribution and implementation of random number generation by python

Introduction

Since I was investigating the multivariate t distribution, I will summarize it.

(The moment I wrote this article, I blew away the data and rewrote it ... I want to die ...)

Derivation of multivariate t distribution

Definition

Follow the notation of Multivariate t-distribution on wikipedia because Gugu is the first to be found. The probability density function for the multivariate t distribution is defined below.

p(\textbf{x}) = \frac{\Gamma\Bigl[{\frac{\nu+p}{2}}\Bigr]}{\Gamma\Bigl[{\frac{\nu}{2}}\Bigr] \nu^{\frac{p}{2}}\pi^{\frac{p}{2}} \det(\Sigma)^{\frac{1}{2}}} \Bigl[ 1 + \frac{1}{\nu} (\textbf{x} - {\bf \mu})^{T} {\bf \Sigma}^{-1}(\textbf{x} - {\bf \mu})\bigr]^{\frac{-(\nu+p)}{2}}. \tag{1}

theory

point

In the first place, what is the t distribution? It can be said that the t distribution is the one that considers the parameters that increase or decrease the variance of the normal distribution, assumes the gamma distribution, and eventually eliminates the integral. Here, we may assume an inverse gamma distribution and a chi-square distribution instead of a gamma distribution, but with a slight change, they are all likely to be equivalent. (I haven't tried)

This allows you to see the average behavior when you are not sure about the value of the variance.

Derivation

Consider two random variables $ V \ sim Gam (\ alpha, \ beta) $, $ X \ sim N (\ textbf {0}, {\ bf \ Sigma}) $. $ X $ is a $ d $ dimensional normal distribution. These two random variables are converted into $ Y and U $ by the functions $ f and g . $ Y = f(V,X) = \mu + \frac{1}{\sqrt{V}}X, \tag{2} $ $ U = g(V,X) = g(V)= V. \tag{3} $$

Here, $ g $ is an identity function of $ V $, $ \ mu \ in \ mathbb {R} ^ {d} $ is a d-dimensional vector representing the mean, and $ V $ is independent of $ X $. I will assume.

This $ f (\ cdot) $ is a straight line (plane) function when $ V $ is fixed, and is a one-to-one correspondence function. Therefore, the inverse function $ f ^ {-1} (\ cdot) $ exists, and $ f^{-1}(Y) = X = \sqrt{U}(Y - \mu), \tag{4} $ It becomes.

Find the distribution of $ Y $ from $ V and X $ whose distribution is already known. Since there are two random variables, $ U $ and $ Y $, the distribution of $ Y $ is obtained by integrating and eliminating $ U $ by considering the joint distribution of $ U and Y $.

Pr(u,y) = Pr(U < u, Y < y) = Pr(g(V) < u,f(X) < y) \\
= Pr(V < g^{-1}(u), X < f^{-1}(y)) \\
= Pr(V < u, X<\sqrt{u}(y - \mu)) = Pr(V < u) Pr(X < \sqrt{u}(y - \mu)). \tag{5}

Here, $ V $ and $ X $ are assumed to be independent, $ Pr(v,x) = Pr(v)Pr(x), \tag{6} $ I used.

Let $ p_ {V} and p_ {X} $ be the probability density functions of $ V $ and $ X , respectively. $ Pr(u,y) = Pr(V < u) Pr(X < \sqrt{u}(y - \mu)) = \int_{-\infty}^{u} \int_{-\infty}^{y} p_{V}(u)p_{X} (\sqrt{u}(y - \mu)) |\det(J)| dudy. \tag{7} $$

here,|\det(J)|Is the Jacobian generated by the change of variables, and the Jacobian matrixJ(d+1)*(d+1)Matrix)

J = \begin{pmatrix} \frac{\partial x}{\partial y^{T}} & \frac{\partial x}{\partial u} \\ \frac{\partial v}{\partial y^{T}} & \frac{\partial v}{\partial u} \end{pmatrix} =
 \begin{pmatrix} \sqrt{u}I_{d} & \frac{1}{2\sqrt{u}(y-\mu)} \\ \textbf{0} & 1 \end{pmatrix}. \tag{8}

Since the Jacobian matrix $ J $ is an upper triangular matrix, its determinant is the product of diagonal elements. $ \det(J) = u^{\frac{d}{2}}. \tag{9} $    Since the content of the integral of $ (7) $ is the probability density function $ p_ {U, Y} (u, y) $ of $ Pr (u, y) $, this is calculated. From $ V \ sim Gam (\ alpha, \ beta) $, $ X \ sim N (\ textbf {0}, {\ bf \ Sigma}) $ $ p_{V}(v) = \frac{\beta^{\alpha}}{\Gamma(\alpha)} v^{\alpha-1} \exp{( - \beta v)}, \tag{10} $ $ p_{X}(\textbf{x}) = \frac{1}{(2\pi)^{\frac{d}{2}}} \frac{1}{\sqrt{\det(\Sigma)}} \exp{\Bigl( - \frac{1}{2} \textbf{x}^{T} \Sigma^{-1} \textbf{x} \Bigr)}, \tag{11} $ Therefore, $ p_{V}(u) = \frac{\beta^{\alpha}}{\Gamma(\alpha)} u^{\alpha-1} \exp{( - \beta u)} = Gam(\alpha, \beta), \tag{12} $

p_{X}(\sqrt{u}(\textbf{y} - \mu)) = \frac{1}{(2\pi)^{\frac{d}{2}}} \frac{1}{\sqrt{\det(\Sigma)}} \exp{\Bigl( - \frac{1}{2} \sqrt{u}(\textbf{y} - \mu)^{T} \Sigma^{-1} \sqrt{u}(\textbf{y} - \mu) \Bigr)} \\
= \frac{1}{(2\pi)^{\frac{d}{2}}} u^{-\frac{d}{2}} u^{\frac{d}{2}} \frac{1}{\sqrt{\det(\Sigma)}} \exp{\Bigl( - \frac{1}{2} (\textbf{y} - \mu)^{T} (u \Sigma^{-1}) (\textbf{y} - \mu) \Bigr)} \\
=  u^{-\frac{d}{2}} \frac{1}{(2\pi)^{\frac{d}{2}}}  \frac{1}{\sqrt{\det(\frac{\Sigma}{u})}} \exp{\Bigl( - \frac{1}{2} (\textbf{y} - \mu)^{T} (u \Sigma^{-1}) (\textbf{y} - \mu) \Bigr)} = u^{-\frac{d}{2}} N(\mu, \frac{\Sigma}{u}). \tag{13}

The probability density function $ p_ {U, Y} (u, y) $ of $ Pr (u, y) $ is $ p_{U,Y}(u,y) = Gam(\alpha, \beta) u^{-\frac{d}{2}} N(\mu, \frac{\Sigma}{u}) |\det(J)| = Gam(\alpha, \beta) N(\mu, \frac{\Sigma}{u}) u^{-\frac{d}{2}}u^{\frac{d}{2}} = Gam(\alpha, \beta) N(\mu, \frac{\Sigma}{u}). \tag{14} $ (Here, the gamma distribution platform is $ u \ in [0, \ infinty), d> 0 $, so the absolute value may be removed. )

By integrating $ p_ {U, Y} (u, y) $ with respect to $ u $, we get the probability density function $ p_ {Y} $ of $ Y . $ p_{Y} = \int_{0}^{\infty} p_{U,Y}(u,y)du = \int_{0}^{\infty} Gam(\alpha, \beta) N(\mu, \frac{\Sigma}{u})du \tag{15} $$

Here, for the sake of simplification of the notation, the distance of the normal distribution Mahalanobis is set to $ \ Delta ^ {2} = (\ textbf {y}-\ mu) ^ {T} \ Sigma ^ {-1} (). It is written as textbf {y}-\ mu) $.

p_{Y}(\textbf{y}) = \int_{0}^{\infty} \frac{\beta^{\alpha}}{\Gamma(\alpha)} u^{\alpha-1} \exp{( - \beta u)} u^{-\frac{d}{2}} \frac{1}{(2\pi)^{\frac{d}{2}}} \frac{1}{\sqrt{\det(\frac{\Sigma}{u})}} \exp{\Bigl( - \frac{u}{2} \Delta^{2} \Bigr)} du. \tag{16}

Extract only the part that depends on $ u $ $ p_{Y}(\textbf{y}) \propto \int_{0}^{\infty} u^{\alpha-1} \exp{( - \beta u)} u^{\frac{d}{2}} \exp{\Bigl( - \frac{u}{2} \Delta^{2} \Bigr)} du \\ = \int_{0}^{\infty} u^{\frac{d}{2}+\alpha-1} \exp{\Bigl( - u( \beta + \frac{\Delta^{2}}{2} ) \Bigr)} du. \tag{17} $

Here, if $ z = u (\ beta + \ frac {\ Delta ^ {2}} {2}) $, then $ u = z (\ beta + \ frac {\ Delta ^ {2}} {2 }) ^ {-1} $, and $ du = (\ beta + \ frac {\ Delta ^ {2}} {2}) ^ {-1} dz $. When $ u = 0 $, $ z = 0 $, $ u \ to \ infinity $, and $ z \ to \ infinty $.

\int_{0}^{\infty} z^{\frac{d}{2} + \alpha -1} ( \beta + \frac{\Delta^{2}}{2} )^{-\frac{d}{2} -\alpha +1} \exp(-z) (\beta + \frac{\Delta^{2}}{2})^{-1} dz \\ = ( \beta + \frac{\Delta^{2}}{2} )^{-\frac{d}{2} -\alpha} \int_{0}^{\infty} z^{(\frac{d}{2} + \alpha) -1} \exp(-z) dz.

From the definition of the gamma function $ p_{Y}(\textbf{y}) \propto \Bigl( \beta + \frac{\Delta^{2}}{2} \Bigr)^{-\frac{d}{2} -\alpha} \Gamma \Bigl( \frac{d}{2} + \alpha \Bigr) \tag{18} $

Considering the $ u $ -independent term of $ (17) $

p_{Y}(\textbf{y}) = \frac{\beta^{\alpha}}{\Gamma(\alpha)} \frac{1}{(2\pi)^{\frac{d}{2}}}  \frac{1}{\sqrt{\det(\Sigma)}} \Bigl( \beta + \frac{\Delta^{2}}{2} \Bigr)^{-\frac{d}{2} -\alpha}  \\
= \frac{\Gamma \Bigl( \frac{d}{2} + \alpha \Bigr)}{\Gamma(\alpha)}\frac{\beta^{\alpha}}{(2\pi)^{\frac{d}{2}}} \frac{1}{\sqrt{\det(\Sigma)}} \Bigl( \beta + \frac{\Delta^{2}}{2} \Bigr)^{-\frac{d}{2} -\alpha} \Bigl( \frac{1}{\beta} \Bigr)^{\frac{d}{2} + \alpha} \beta^{-\frac{d}{2}-\alpha} \\
= \frac{\Gamma \Bigl( \frac{d}{2} + \alpha \Bigr)}{\Gamma(\alpha)}\frac{1}{(2 \beta \pi)^{\frac{d}{2}}} \frac{1}{\sqrt{\det(\Sigma)}} \Bigl( 1 + \frac{\Delta^{2}}{2 \beta} \Bigr)^{-\frac{d}{2} -\alpha}. \tag{19}

Here, if $ \ alpha = \ beta = \ frac {\ nu} {2} $ and $ p = d $ are set, $ p_{Y}(\textbf{y}) = \frac{\Gamma \Bigl( \frac{\nu+p}{2} \Bigr)}{\Gamma \Bigl( \frac{\nu}{2} \Bigr)}\frac{1}{(\nu \pi)^{\frac{p}{2}}} \frac{1}{\sqrt{\det(\Sigma)}} \Bigl( 1 + \frac{\Delta^{2}}{\nu} \Bigr)^{-\frac{(\nu+p)}{2}}, \tag{20} $ Is in agreement with the definition of the t distribution of $ (1) $. $ \ Box $

Practice

Implementation of random number generation by Python

Since the $ (2) $ expression is a random number generation method as it is, implement this in python.

# -*- coding: utf-8 -*-
#!/usr/bin/python

import numpy as np
import matplotlib.pyplot as plt

#Set the parameters.
N = 5000
d = 10
df = 4

mean = np.zeros(d)
cov = np.eye(d)

#Random number generation from gamma distribution
V = np.random.gamma(shape=df/2., scale=df/2.,size=N)

#Random number generation from multivariate normal distribution
X = np.random.multivariate_normal(mean=mean, cov=cov, size=N)

#Convert X.
V = V.reshape((N,1))
denom = np.sqrt(np.tile(V,d))
Y = mean + X / denom


#Generate random numbers from a multivariate normal distribution for comparison.
#(Question:Is it okay for comparison to look like this? )
cov_ = ((df - 2) / float(df)) * np.corrcoef(Y.T)
X_ = np.random.multivariate_normal(mean=mean, cov=cov_, size=N)

#Plot.
fig = plt.figure()
ax = fig.add_subplot(111)
ax.hist(X_[:,0], color="blue", normed=True, label="Normal dist", bins=30)
ax.hist(Y[:,0], color="red", normed=True, label="t dist", bins=30, alpha=0.5)
ax.set_title("Comparison of normal and t distribution")
ax.legend()
fig.show()

fig = plt.figure()
ax = fig.add_subplot(111)
ax.scatter(X_[:,0], X_[:,1],color="blue",label="Normal dist")
ax.scatter(Y[:,0], Y[:,1],color="red",label="t dist",alpha=0.5)
ax.set_title("scatter plot of t and normal dist")
ax.legend()
fig.show()

figure_1.png

figure_2.png

The variance is large and it looks like a t distribution? (I wanted to compare other than one dimension at a time, but I rewrote it and exhausted it, so this is about it ...)

Finally

I'm an atmosphere math man, so please let me know if you make a mistake. If I have time, I would like to follow the connection of various distributions.

Recommended Posts

Derivation of multivariate t distribution and implementation of random number generation by python
Anomaly detection introduction and method summary
Derivation of multivariate t distribution and implementation of random number generation by python
Anomaly detection using MNIST by Autoencoder (PyTorch)
ECG data anomaly detection by Matrix Profile
[python] Random number generation memorandum
Random number generation according to multivariate normal distribution using Cholesky decomposition of covariance matrix
Prime number generation program by Python
Pseudo-random number generation and random sampling
Random number generation summary by Numpy
[Python] Implementation of Nelder–Mead method and saving of GIF images by matplotlib
PRML Chapter 2 Student's t Distribution Python Implementation
[Python] Comparison of Principal Component Analysis Theory and Implementation by Python (PCA, Kernel PCA, 2DPCA)
Divides the character string by the specified number of characters. In Ruby and Python.
Implementation of TRIE tree with Python and LOUDS
Cholesky decomposition was behind the random number generation according to the multivariate normal distribution
Explanation of edit distance and implementation in Python
Python> Sort by number and sort by alphabet> Use sorted ()
Sequential update of covariance to derivation and implementation of expressions
Plot and understand the multivariate normal distribution in Python
Implementation of DB administrator screen by Flask-Admin and Flask-Login
Overview of generalized linear models and implementation in Python
Random string generation (Python)
Python implementation of CSS3 blend mode and talk of color space
Derivation and implementation of update equations for non-negative tensor factorization
Mass generation of QR code with character display by Python
Explanation and implementation of SocialFoceModel
Python implementation of particle filters
Python implementation mixed Bernoulli distribution
Maxout description and implementation (Python)
Implementation of quicksort in Python
Source installation and installation of Python
I checked the distribution of the number of video views of "Flag-chan!" [Python] [Graph]
Count the number of Thai and Arabic characters well in Python
Automatic acquisition of gene expression level data by python and R
Crawling with Python and Twitter API 2-Implementation of user search function
[Python] I thoroughly explained the theory and implementation of logistic regression
[Python] I thoroughly explained the theory and implementation of decision trees
Practice of data analysis by Python and pandas (Tokyo COVID-19 data edition)
Comparison of calculation speed by implementation of python mpmath (arbitrary precision calculation) (Note)
Implementation example of hostile generation network (GAN) by keras [For beginners]
[Scientific / technical calculation by Python] Derivation of analytical solutions for quadratic and cubic equations, mathematical formulas, sympy