[PYTHON] Understanding the meaning of complex and bizarre normal distribution formulas

The normal distribution plays a very important role in statistics. The mathematical formula (density function) that expresses the normal distribution is


\Phi(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp \left( -\frac{(x-\mu)^2}{2\sigma^2} \right)

But this is a very complicated formula ...

It looks a little simpler for the standard normal distribution, which has variance $\sigma^2 = 1$ and mean $\mu = 0$.


\phi(x) = \frac{1}{\sqrt{2\pi}}\exp \left( -\frac{x^2}{2} \right)

The graph looks like this: the so-called bell curve. (From here on, I will draw the graphs with Python as needed.)

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-4,4, 100)
y = (1/np.sqrt(2*np.pi))*np.exp(-x**2/2)

plt.ylim(0,0.45)
plt.plot(x,y)
plt.show()

normdist0.png
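As a quick sanity check on the two formulas above, here is a minimal sketch that codes both densities and confirms that the general one reduces to the standard one when $\mu = 0$ and $\sigma = 1$. The function names `normal_pdf` and `standard_normal_pdf` are my own, chosen only for illustration.

```python
import numpy as np

def normal_pdf(x, mu=0.0, sigma=1.0):
    """General normal density with mean mu and standard deviation sigma (hypothetical helper)."""
    return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)

def standard_normal_pdf(x):
    """Standard normal density, i.e. the special case mu = 0, sigma = 1 (hypothetical helper)."""
    return np.exp(-x ** 2 / 2) / np.sqrt(2 * np.pi)

x = np.linspace(-4, 4, 100)

# The general formula with mu = 0, sigma = 1 should match the standard one.
same = np.allclose(normal_pdf(x, mu=0.0, sigma=1.0), standard_normal_pdf(x))
print(same)

# Height of the peak at x = 0, which equals 1 / sqrt(2 * pi).
peak = standard_normal_pdf(0.0)
print(peak)
```

The peak height is about 0.399, which is why an upper y-limit of 0.45 fits the plot comfortably.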

To begin with, I think the normal distribution grew out of the wish to express probability with a smooth, symmetric function that is concentrated around a single point. So let's take the plunge and start from the quadratic function


f(x) = x^2

Plotted, it looks like this:

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-10,10, 100)
y = x**2

plt.plot(x,y)
plt.show()

normdist.png

Hmm, this opens upward. As it stands it cannot serve as a distribution, so multiply it by minus one to flip it over so that the peak is on top.


f(x) = -x^2

x = np.linspace(-1,1, 100)
y = -x**2

plt.xlim(-1.2,1.2)
plt.ylim(-1,0.2)
plt.plot(x,y)
plt.show()

normdist3.png

It's starting to look the part. To stretch out the tails and make it bell-shaped, put it in the exponent of $e$.


f(x) = e^{-x^2}

x = np.linspace(-1,1, 100)
y = np.exp(-x**2)

plt.xlim(-1.5,1.5)
plt.ylim(0,1.2)
plt.plot(x,y)
plt.show()

normdist4.png

The shape now looks just like a normal distribution. The form of the normal distribution originates from $e^{-x^2}$.
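To see numerically what exponentiating does to the tails, the following small sketch compares $-x^2$ with $e^{-x^2}$ at a few points. The sample points and variable names are arbitrary choices of mine for illustration.

```python
import numpy as np

# A few sample points moving away from the center (arbitrary choice).
x = np.array([0.0, 1.0, 2.0, 3.0])

parabola = -x ** 2        # keeps falling toward minus infinity
bell = np.exp(-x ** 2)    # stays positive and flattens out toward 0

for xi, p, b in zip(x, parabola, bell):
    print(f"x={xi:.0f}  -x^2={p:6.1f}  exp(-x^2)={b:.6f}")
```

The parabola drops without bound, while the exponential is exactly 1 at the center and shrinks toward 0 without ever becoming negative, which is the behavior a probability density needs.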

Next, scale $x$ by $1/\sqrt{2}$ so that the function is easier to work with when differentiated. That is, change variables with $y = \sqrt{2}x$, so that $x^2 = y^2/2$:


g(y) = \exp \left(-\frac{y^2}{2} \right)

x = np.linspace(-3,3, 500)
y1 = np.exp(-(x**2))
y2 = np.exp(-(x**2)/2)

plt.xlim(-3,3)
plt.ylim(0,1.1)
plt.plot(x,y1,"b", label="exp(-x^2)")
plt.plot(x,y2,"g", label="exp(-(x^2)/2)")
plt.legend()
plt.show()

normdist5.png

The curve has spread out a little sideways.
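The remark that the rescaled form is easier to differentiate can be checked numerically. The sketch below assumes the point is that the derivative of $e^{-x^2}$ is $-2x\,e^{-x^2}$, while the derivative of $e^{-y^2/2}$ is simply $-y\,e^{-y^2/2}$ with no stray factor of 2. The helper `central_diff` is a hypothetical name for a plain central-difference approximation.

```python
import numpy as np

def f(x):
    return np.exp(-x ** 2)

def g(y):
    return np.exp(-y ** 2 / 2)

def central_diff(func, t, h=1e-5):
    """Central-difference approximation of the derivative (hypothetical helper)."""
    return (func(t + h) - func(t - h)) / (2 * h)

t = np.linspace(-3, 3, 13)

# d/dx exp(-x^2) = -2x exp(-x^2): a factor of 2 appears.
df_ok = np.allclose(central_diff(f, t), -2 * t * f(t), atol=1e-8)

# d/dy exp(-y^2/2) = -y exp(-y^2/2): the factor of 2 cancels.
dg_ok = np.allclose(central_diff(g, t), -t * g(t), atol=1e-8)

print(df_ok, dg_ok)
```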

Since this is a probability, the area under the curve must be 1 (adding up all possible events has to give 100%), so we need to integrate the function and normalize it.

From the [Gaussian integral (see Wikipedia)](http://ja.m.wikipedia.org/wiki/Gauss integral), the integral over the entire range of $x$ is


\int_{-\infty}^{\infty} e^{-x^2}dx = \sqrt{\pi}

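Applying the change of variables $y = \sqrt{2}x$ (so $x = y/\sqrt{2}$ and $dx = dy/\sqrt{2}$) turns the left-hand side into $\frac{1}{\sqrt{2}}\int_{-\infty}^{\infty} e^{-y^2/2}dy$, and therefore


\int_{-\infty}^{\infty} \exp \left( {-\frac{y^2}{2}}\right)dy = \sqrt{2\pi}

Dividing both sides by $\sqrt{2\pi}$ gives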


\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \exp \left( {-\frac{y^2}{2}}\right)dy = 1

We have arrived at the formula for the standard normal distribution.
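The three integrals above can also be confirmed numerically. This is only a rough sketch: it assumes that truncating the infinite range to $[-10, 10]$ is harmless (the integrand is vanishingly small beyond that), and it uses a hand-written trapezoidal rule, `integrate`, which is my own helper rather than a library function.

```python
import numpy as np

# Truncate the infinite range to [-10, 10]; the tails beyond are negligible.
x = np.linspace(-10, 10, 200001)
dx = x[1] - x[0]

def integrate(values):
    """Trapezoidal rule on the uniform grid x (hypothetical helper)."""
    return dx * (values.sum() - 0.5 * (values[0] + values[-1]))

area_f = integrate(np.exp(-x ** 2))                               # expect sqrt(pi)
area_g = integrate(np.exp(-x ** 2 / 2))                           # expect sqrt(2*pi)
area_phi = integrate(np.exp(-x ** 2 / 2) / np.sqrt(2 * np.pi))    # expect 1

print(area_f, np.sqrt(np.pi))
print(area_g, np.sqrt(2 * np.pi))
print(area_phi)
```

Each computed area should agree with its closed-form value to many decimal places, which is the normalization argument in numerical form.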


x = np.linspace(-1,1, 100)
y = np.exp(-(x**2)/2)

plt.xlim(-1.5,1.5)
plt.ylim(0,1.2)
plt.plot(x,y)
plt.show()

normdist6.png

In short, this formula starts from $e^{-x^2}$ and is adjusted so that the area obtained by integrating it is 1.
