This is the next installment in the series on statistics and visualization.
The chi-square distribution is often used in the chi-square test, for example in A/B testing. It is written $\chi^2$ and read "chi-square". Its graph has the following shape, which changes with the value of $k$, called the degrees of freedom.
(The graph drawing code is here)
When I looked up the definition of the chi-square distribution on Wikipedia, I got this answer:
Take $k$ independent random variables $X_1, \dots, X_k$ that follow the standard normal distribution. The distribution followed by the statistic
$Z = \sum_{i=1}^k X_i^2$ is called the chi-square distribution with $k$ degrees of freedom.
Hmm, what does that mean? Do you square the density function of the normal distribution? Apparently not.
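One way to read this definition is as a recipe: draw $k$ independent standard normal values, square each one, and add them up. The sketch below does just that with NumPy. The function name `sample_chi2` and the fixed seed are my own choices for illustration, not part of the Wikipedia definition. As a sanity check it uses the known facts that a chi-square distribution with $k$ degrees of freedom has mean $k$ and variance $2k$.

```python
import numpy as np

def sample_chi2(k, n=30000, seed=0):
    """Draw n samples of Z = X_1^2 + ... + X_k^2 with X_i ~ N(0, 1).

    Hypothetical helper, for illustration only.
    """
    rng = np.random.default_rng(seed)
    # One row per sample, one column per X_i
    x = rng.standard_normal(size=(n, k))
    # Square each value, then sum across the k columns
    return (x ** 2).sum(axis=1)

z = sample_chi2(k=3)
print("sample mean:", z.mean())     # should be close to k = 3
print("sample variance:", z.var())  # should be close to 2k = 6
```

Note that it is the random values that are squared, not the density function.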
First of all, since the definition says "$k$ random variables that independently follow the standard normal distribution", I will start by drawing a histogram of random numbers that follow the standard normal distribution: 30,000 random numbers with $X \sim N(0, 1)$.
x = np.random.normal(0, 1, 30000)
plot_dist(x, bins=80, title="normal distribution.")
(The full code for drawing the graph is here)
If you square each of these random numbers, the resulting values follow the chi-square distribution. In code:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import chi2
# Generate 30,000 random numbers that follow a standard normal distribution
x = np.random.normal(0, 1, 30000)
# Square the generated random numbers [This is the key!]
x2 = x**2
# Draw the histogram (density=True replaces normed=True, which newer matplotlib no longer accepts)
plt.figure(figsize=(7,5))
plt.title("chi2 distribution.[k=1]")
plt.hist(x2, 80, color="lightgreen", density=True)
# Draw the density of the chi-square distribution with 1 degree of freedom
xx = np.linspace(0, 25 ,1000)
plt.plot(xx, chi2.pdf(xx, df=1), linewidth=2, color="r")
The resulting graph is shown below. Because the values are squared, they are all non-negative, so all the data now lies to the right of $x = 0$.
The density function of the chi-square distribution with 1 degree of freedom is drawn on the same graph, and the two are almost identical! What is plotted here is $X_1^2$, since we squared random numbers that follow a standard normal distribution and plotted them as is. Since there is only one $X$, this is the chi-square distribution with one degree of freedom.
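The match can also be checked by hand. As a sketch, write $\phi$ and $\Phi$ for the density and the cumulative distribution function of the standard normal distribution (these two symbols are introduced here for the derivation). For $t > 0$:

$$
P(X_1^2 \le t) = P(-\sqrt{t} \le X_1 \le \sqrt{t}) = 2\Phi(\sqrt{t}) - 1
$$

Differentiating with respect to $t$ gives the density of $X_1^2$:

$$
f(t) = \frac{d}{dt}\bigl(2\Phi(\sqrt{t}) - 1\bigr) = \frac{\phi(\sqrt{t})}{\sqrt{t}} = \frac{1}{\sqrt{2\pi t}} e^{-t/2}
$$

This is the density of the chi-square distribution with 1 degree of freedom. It is not the same as squaring the normal density, which would give $\phi(x)^2 = \frac{1}{2\pi} e^{-x^2}$.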
Then, if you draw an animation with the degrees of freedom going from 1 to 10, adding up $k$ squared random numbers for $k$ degrees of freedom:
This is also a perfect match! The "square" in chi-square can be interpreted as the square of random numbers that follow a standard normal distribution! Drawing the histograms gave me a concrete picture of it.
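"A perfect match" can also be put into numbers. The sketch below is an addition of mine, not part of the original plotting code: for each $k$ from 1 to 10 it sums $k$ squared standard normal random numbers and measures the Kolmogorov-Smirnov distance to `chi2(df=k)` with `scipy.stats.kstest`. The seed is arbitrary. Small values mean the empirical distribution is close to the theoretical one.

```python
import numpy as np
from scipy.stats import chi2, kstest

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
n = 30000

ks_stats = {}
for k in range(1, 11):
    # Sum of k squared standard normal random numbers
    z = (rng.standard_normal(size=(n, k)) ** 2).sum(axis=1)
    # Kolmogorov-Smirnov distance between the sample and chi2(df=k)
    result = kstest(z, chi2(df=k).cdf)
    ks_stats[k] = result.statistic
    print(f"k={k:2d}  KS statistic={result.statistic:.4f}")
```

With 30,000 samples the statistic should stay very small for every $k$ if the sums really follow the chi-square distribution.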
Below is the code for drawing the animation of the graph with 1 to 10 degrees of freedom.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as ani
from scipy.stats import chi2
def animate(nframe):
    n = 30000
    k = nframe + 1
    cum = np.zeros(n)
    for i in range(k):
        # Generate 30,000 random numbers that follow a standard normal distribution
        x = np.random.normal(0, 1, n)
        # Square the generated random numbers [This is the key!]
        x2 = x**2
        # The number of squared terms added up is the degrees of freedom
        cum += x2
    # Draw the histogram
    plt.clf()
    #plt.figure(figsize=(9,7))
    plt.ylim(0, 0.6)
    plt.xlim(0, 25)
    plt.title("chi2 histogram & pdf [k=%d]"%k)
    plt.hist(cum, 80, color="lightgreen", density=True)
    # Draw the density of the chi-square distribution with k degrees of freedom
    xx = np.linspace(0, 25 ,1000)
    plt.plot(xx, chi2.pdf(xx, df=k), linewidth=2, color="r")
fig = plt.figure(figsize=(10,8))
# blit=False because animate() redraws the whole figure and returns no artists
anim = ani.FuncAnimation(fig, animate, frames=10, blit=False)
anim.save('chi2_hist_dist.gif', writer='imagemagick', fps=1, dpi=64)
Because ImageMagick is used to write the GIF animation, please install it by referring to the official ImageMagick site and PythonMagick (download/python/).
However, installing ImageMagick and PythonMagick can be difficult depending on your environment, so if you just want to create an animation easily, you can save it as mp4 as shown below, without those libraries (matplotlib's default writer for this is ffmpeg, so ffmpeg itself must be installed).
anim.save('filename.mp4', fps=13)
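If you are not sure which writer your environment has, matplotlib can tell you. The sketch below is my own addition: the helper name `pick_writer` and the order of candidates are assumptions, while `matplotlib.animation.writers.is_available` is the registry call it relies on.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this also runs without a display
import matplotlib.animation as ani

def pick_writer(candidates=("ffmpeg", "imagemagick", "pillow")):
    """Return the name of the first available writer, or None.

    Hypothetical helper, for illustration only.
    """
    for name in candidates:
        if ani.writers.is_available(name):
            return name
    return None

writer = pick_writer()
print("available writer:", writer)
# Example use with the `anim` object created earlier:
# anim.save("chi2_hist_dist.gif", writer=writer, fps=1)
```

If the helper returns `None`, none of the listed writers was found and one of them needs to be installed first.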