This series gives a brief explanation of "Basics of Modern Mathematical Statistics" by Tatsuya Kubokawa and implements its contents in Python. I used Google Colaboratory (hereinafter referred to as Colab) for the implementation. If you have any suggestions, I would appreciate it if you could write them in the comment section. I only touch on the parts that I thought needed explanation, so this series may not suit readers who want to understand the entire book thoroughly. Please also note that formula numbers and proposition / definition numbers follow the book, so the numbering in this article may skip.
A probability distribution is a function that returns a probability when given a value of the variable. Each type of probability distribution has its own characteristics and uses. It is important to know these characteristics, because if you assume the wrong probability distribution, your conclusions will also be wrong. You can find the expected value and variance of each distribution using the probability generating function, moment generating function, and characteristic function from the previous chapter, but I think they are worth remembering. You will probably memorize them as you use them. At the end of the chapter, the book touches on Stein's identity and Stirling's formula. If you search online, you will find many probability distributions that are not introduced in this article. I plan to write a separate article on "Probability generating function, Moment generating function, Characteristic function", which proves propositions using the probability generating function, so I will introduce those topics at that time.
We dealt with expected value and variance in Chapter 2, but did not touch on the relationship between them. Let $E[X] = \mu$. Then the variance can be written as $\mathrm{Var}(X) = E[(X - \mu)^2] = E[X^2] - \mu^2$.
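The relationship between expected value and variance, $\mathrm{Var}(X) = E[X^2] - (E[X])^2$, can be checked on a small example. The sketch below uses a fair six-sided die, which is my own illustrative choice and not an example from the book, and exact fractions so that no rounding is involved.

```python
from fractions import Fraction

# Illustrative example (not from the book): a fair six-sided die
values = range(1, 7)
prob = Fraction(1, 6)  # each face is equally likely

mean = sum(x * prob for x in values)                            # E[X]
second_moment = sum(x**2 * prob for x in values)                # E[X^2]
var_by_definition = sum((x - mean)**2 * prob for x in values)   # E[(X - mu)^2]
var_by_identity = second_moment - mean**2                       # E[X^2] - mu^2

print(mean, var_by_definition, var_by_identity)
```

Both ways of computing the variance give the same fraction, which is what the identity claims.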
Before the binomial distribution, let me explain the Bernoulli trial. To quote the wording in the book:
A Bernoulli trial is an experiment that results in 'success' with probability $p$ and in 'failure' with probability $1-p$; the random variable $X$ takes the value $1$ on 'success' and $0$ on 'failure'.
The binomial distribution is the distribution of the variable $X$ defined as the "number of 'successes'" when this Bernoulli trial is performed $n$ times independently (the previous trial does not affect the next trial). The probability of succeeding $k$ times and failing $n-k$ times is expressed by the following formula ('success' and 'failure' can be any binary outcome, such as 'getting sick' and 'not getting sick').
$$ P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}, \quad k = 0, 1, \dots, n $$
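To see how the definition works, one can repeat $n$ Bernoulli trials many times and compare the observed frequency of each number of successes with the binomial probability $\binom{n}{k} p^k (1-p)^{n-k}$. The following is a minimal sketch using only the standard library. The values of `n`, `p`, `repeats` and the helper names are my own assumptions for illustration.

```python
import random
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p), computed directly from the formula."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def simulate_successes(n, p, rng):
    """Run n independent Bernoulli trials and return the number of successes."""
    return sum(1 for _ in range(n) if rng.random() < p)

rng = random.Random(0)           # fixed seed so the run is reproducible
n, p, repeats = 10, 0.3, 20000   # illustrative values, not from the book

# Count how often each number of successes occurs
counts = [0] * (n + 1)
for _ in range(repeats):
    counts[simulate_successes(n, p, rng)] += 1

# Columns: k, observed relative frequency, theoretical probability
for k in range(n + 1):
    print(k, round(counts[k] / repeats, 4), round(binomial_pmf(k, n, p), 4))
```

The observed frequencies should be close to the theoretical probabilities, and they get closer as `repeats` grows.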
As an example, let's draw the probability distribution of the number of heads when a coin is tossed 30 times and 1000 times.
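The Poisson distribution arises as a limiting case of the binomial distribution. When a "rare phenomenon" is "observed (tried) a large number of times" (example: the distribution of the number of traffic accidents that occur in one day), the Poisson distribution is used instead of the binomial distribution. In other words, if we take the limit $n \to \infty$, $p \to 0$ while keeping $np = \lambda$ fixed, the binomial probability converges to the Poisson probability
$$ P(X = k) = \frac{\lambda^k}{k!} e^{-\lambda}, \quad k = 0, 1, 2, \dots $$
Now, let's check the binomial distribution and Poisson distribution with Python.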
%matplotlib inline
import matplotlib.pyplot as plt
from scipy.special import comb  # function to calculate the number of combinations
import pandas as pd
# Draw the graph of the binomial distribution Bin(n, p)
def Bin(n, p, x_min, x_max, label):
  # Calculate P(X = k) only for the k shown on the x-axis;
  # this keeps comb() from overflowing when n is large
  ks = range(x_min, min(x_max, n) + 1)
  prob = pd.Series([comb(n, k) * p**k * (1 - p)**(n - k) for k in ks], index=ks)
  plt.bar(prob.index, prob, label=label)  # bar graph (x values, y values)
  plt.xlim(x_min, x_max)
  plt.legend()
  plt.show()
Bin(30, 0.5, 0, 30, "n=30,p=0.5")  # 30 coin tosses
Bin(1000, 0.5, 400, 600, "n=1000,p=0.5")  # 1000 coin tosses
Bin(40000, 0.00007, 0, 15, "n=40000,p=0.00007")  # try increasing n and decreasing p
If you do this, you will get the following three graphs.
 
 

What do you think? Even though the same function was used, the last graph is slightly skewed and looks like a Poisson distribution.
The remaining three discrete probability distributions each have their own underlying idea, but I think you can follow them in the book as long as you keep in mind what the discrete random variable $X$ represents.
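The similarity can also be checked numerically. The sketch below compares the binomial probabilities for $n = 40000$, $p = 0.00007$ with the Poisson probabilities $\lambda^k e^{-\lambda} / k!$ for $\lambda = np = 2.8$. It uses only the standard library, and the helper names are my own.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return lam**k * exp(-lam) / factorial(k)

n, p = 40000, 0.00007   # same values as the third graph
lam = n * p             # Poisson parameter, lambda = n * p

# Columns: k, binomial probability, Poisson probability
for k in range(11):
    print(k, round(binomial_pmf(k, n, p), 6), round(poisson_pmf(k, lam), 6))

# Largest difference between the two probabilities over the plotted range
max_gap = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(16))
print("largest difference:", max_gap)
```

For these values the two columns agree to several decimal places, which is why the bar graph looks like a Poisson distribution.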
The continuous distributions introduced in the book are as follows.
・ Uniform distribution
・ Normal distribution
・ Gamma distribution, chi-square distribution
・ Exponential distribution, hazard distribution
・ Beta distribution
Here too, I will pick out a few of them.
The normal distribution is the most important probability distribution because it has a symmetrical shape centered on the mean and is easy to handle.
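When the random variable $X$ follows a normal distribution with mean $\mu$ and variance $\sigma^2$, the probability density function of $X$ is
$$ f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right), \quad -\infty < x < \infty $$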
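As a quick check, the normal density $\frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$ can be written by hand and compared with `statistics.NormalDist` from the Python standard library. This is only a sketch. The function name `normal_pdf` and the values of `mu` and `sigma2` are my own choices for illustration.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def normal_pdf(x, mu, sigma2):
    """Density of the normal distribution with mean mu and variance sigma2."""
    return exp(-(x - mu)**2 / (2 * sigma2)) / sqrt(2 * pi * sigma2)

mu, sigma2 = 1.0, 4.0                    # illustrative values, not from the book
reference = NormalDist(mu, sqrt(sigma2)) # NormalDist takes the standard deviation

# Columns: x, hand-written density, library density
points = (-3.0, 0.0, 1.0, 2.5)
for x in points:
    print(x, normal_pdf(x, mu, sigma2), reference.pdf(x))

# The density is symmetric about the mean
print(normal_pdf(mu - 1.5, mu, sigma2), normal_pdf(mu + 1.5, mu, sigma2))
```

Note that `NormalDist` is parameterized by the standard deviation $\sigma$, while the formula in the text uses the variance $\sigma^2$.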
The chi-square distribution is a special case of the gamma distribution, but in statistics it is the more important of the two. As we will see in later chapters, the chi-square distribution is used for interval estimation of the population variance, goodness-of-fit tests, independence tests, and so on. For the chi-square distribution, the properties that appear in Chapters 4 and 5 are more important than the formula expressed using the gamma function, so only its shape is drawn here. The chi-square distribution with $n$ degrees of freedom is denoted by $\chi_n^2$. I will omit the explanation of degrees of freedom because it will be easier to understand in the following chapters.
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats
x1 = np.arange(0, 15, 0.1)
plt.figure(figsize=(7, 5))
# Draw the chi-square density for several degrees of freedom (df = degrees of freedom)
for n in [1, 2, 3, 5, 10, 12]:
  plt.plot(x1, stats.chi2.pdf(x=x1, df=n), label=f'n={n}')
plt.ylim(0, 0.7); plt.xlim(0, 15)
plt.legend()
plt.show()
When you do this, you get:

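The statement that the chi-square distribution is a special case of the gamma distribution can be checked with SciPy. The sketch below assumes the usual correspondence in which $\chi_n^2$ is the gamma distribution with shape $n/2$ and scale $2$ (in SciPy's parameterization, `a=n/2`, `scale=2`), and compares the two densities on a grid.

```python
import numpy as np
from scipy import stats

x = np.arange(0.1, 15, 0.1)  # start slightly above 0, where the n=1 density diverges

# For each degree of freedom n, compare chi2(n) with gamma(shape=n/2, scale=2)
matches = {}
for n in (1, 2, 3, 5, 10, 12):
    chi2_pdf = stats.chi2.pdf(x, df=n)
    gamma_pdf = stats.gamma.pdf(x, a=n / 2, scale=2)
    matches[n] = bool(np.allclose(chi2_pdf, gamma_pdf))

print(matches)
```

Each entry of `matches` reports whether the two densities agree on the whole grid for that degree of freedom.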
The probability density function of the exponential distribution is given by the following formula, and the distribution is denoted by $Ex(\lambda)$.
$$ f(x \mid \lambda) = \lambda e^{-\lambda x}, \quad x > 0 $$
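As a small sketch, the exponential density can be tried out with the standard library alone, assuming the usual rate form $\lambda e^{-\lambda x}$ for $x > 0$. The code computes $P(X \le t)$ by integrating the density numerically, estimates the same probability from simulated draws, and compares both with the closed form $1 - e^{-\lambda t}$. The values of `lam` and `t` are my own illustrative choices.

```python
import random
from math import exp

def exponential_pdf(x, lam):
    """Density lam * exp(-lam * x) for x > 0, and 0 otherwise."""
    return lam * exp(-lam * x) if x > 0 else 0.0

lam, t = 2.0, 0.5  # illustrative values, not from the book

# P(X <= t) by integrating the density with the midpoint rule
steps = 10000
h = t / steps
prob_by_integration = sum(exponential_pdf((i + 0.5) * h, lam) for i in range(steps)) * h

# The same probability, and the mean, estimated from simulated draws
rng = random.Random(0)  # fixed seed so the run is reproducible
samples = [rng.expovariate(lam) for _ in range(100000)]
prob_by_simulation = sum(1 for x in samples if x <= t) / len(samples)
sample_mean = sum(samples) / len(samples)

print(prob_by_integration, prob_by_simulation, 1 - exp(-lam * t))
print(sample_mean, 1 / lam)  # the mean of the exponential distribution is 1 / lam
```

The three probabilities on the first output line should be close to each other, and the sample mean should be close to $1/\lambda$.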
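In the beta distribution, the random variable $X$ takes values on the interval $(0,1)$, and its probability density function is
$$ f(x \mid a, b) = \frac{1}{B(a, b)} x^{a-1} (1 - x)^{b-1}, \quad 0 < x < 1, $$
where $B(a, b)$ is the beta function.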
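The beta density can also be sketched with the standard library, assuming the usual form $x^{a-1}(1-x)^{b-1} / B(a,b)$ with $B(a,b) = \Gamma(a)\Gamma(b)/\Gamma(a+b)$. The code below checks numerically that the density integrates to 1 and that its mean is $a/(a+b)$. The parameter values and function names are my own illustrative choices.

```python
from math import gamma

def beta_function(a, b):
    """B(a, b) expressed through the gamma function."""
    return gamma(a) * gamma(b) / gamma(a + b)

def beta_pdf(x, a, b):
    """Density of the beta distribution on (0, 1); 0 outside the interval."""
    if not 0 < x < 1:
        return 0.0
    return x**(a - 1) * (1 - x)**(b - 1) / beta_function(a, b)

a, b = 2.0, 5.0  # illustrative values, not from the book

# Midpoint rule on (0, 1)
steps = 10000
h = 1 / steps
mids = [(i + 0.5) * h for i in range(steps)]
total = sum(beta_pdf(x, a, b) for x in mids) * h      # should be close to 1
mean = sum(x * beta_pdf(x, a, b) for x in mids) * h   # should be close to a / (a + b)

print(total, mean, a / (a + b))
print(beta_pdf(0.3, 1, 1))  # with a = b = 1 the density is constant (uniform distribution)
```

With $a = b = 1$ the density equals 1 everywhere on $(0,1)$, so the uniform distribution on that interval is a special case.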
I have only introduced a few of the distributions, but that's all for Chapter 3. Thank you very much.
"Basics of Modern Mathematical Statistics" by Tatsuya Kubokawa