Statistical Test Grade 2: Probability distributions learned in Python ②

Introduction

While studying for the statistical test, you encounter various probability distributions, and I think it is hard to get an intuition for them just by looking at the formulas. So, in this series, I draw each probability distribution in Python while varying its parameters and attach the resulting images. (The previous post is here; last time, I focused on the binomial distribution and the Poisson distribution.)

References

For the explanations of the probability distributions, I referred to the following.

- Statistics Time
- Introduction to Statistics (Basic Statistics I), Department of Statistics, Faculty of Liberal Arts, University of Tokyo

Various probability distributions

This article does not go into detailed derivations of the formulas; instead, it focuses on grasping the shape of each distribution and what the distribution means. It covers the following three distributions.

- Geometric distribution
- Exponential distribution
- Negative binomial distribution

Geometric distribution

The distribution followed by $X$, the number of trials until the first success in independent trials (Bernoulli trials) with only two possible outcomes, such as whether a tossed coin lands heads or tails, is called the **geometric distribution**. It is very similar to the binomial distribution, which instead concerns the number of successes in a fixed number of trials $n$. (For the binomial distribution, please refer to the previous article.)

- The number of rolls until a die shows 1 for the first time
- The number of tosses until a coin lands heads for the first time

and so on follow a geometric distribution.

The formula for the probability mass function of the geometric distribution is expressed as follows.


P(X = k) = p(1-p)^{k-1}

$p$ is the probability of success on each trial.

Also, when the random variable $X$ follows a geometric distribution, the expected value $E(X)$ and the variance $V(X)$ are as follows.


E(X) = \frac{1}{p}


V(X) = \frac{1-p}{p^2}

For example, the expected number of rolls until a die shows 1 for the first time is $\frac{1}{\frac{1}{6}} = 6$.
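
As a quick sanity check of these formulas, here is a minimal sketch using scipy (scipy.stats.geom counts the number of trials starting from 1, which matches the definition above):

from scipy.stats import geom

p = 1 / 6  # probability of rolling a 1
print(geom.mean(p))  # 6.0 = 1/p
print(geom.var(p))   # 30.0 = (1 - p) / p**2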

Now let's draw how the shape of the geometric distribution changes as the value of $p$ (the probability of success) changes.


import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation

def geometric_dist(p, k):
    # P(X = k) = (1 - p)^(k - 1) * p: first success on the k-th trial
    return ((1 - p) ** (k - 1)) * p

fig = plt.figure()

def update(a):
    plt.cla()  # clear the previous frame

    x = np.arange(1, 50, 1)
    y = [geometric_dist(a, i) for i in x]

    plt.bar(x, y, align="center", width=0.4, color="blue",
            alpha=0.5, label="geometric p= " + "{:.1f}".format(a))

    plt.legend()
    plt.ylim(0, 0.3)
    plt.xlim(0, 50)

ani = animation.FuncAnimation(fig,
                              update,
                              interval=1000,
                              frames=np.arange(0.1, 1, 0.1),
                              blit=False)  # update() returns no artists, so blitting must be off
ani.save('Geometric_distribution.gif', writer='pillow')
plt.show()

Geometric_distribution.gif

This shows the geometric distribution for events with success probabilities from $10$% to $90$%. You can see intuitively that the higher the probability of success, the more the probability concentrates at small numbers of trials. For an event with a $90$% success probability, failing $10$ times in a row ($0.1^{10} = 10^{-10}$) is practically a miracle.

In addition, the geometric distribution has the property that the probability of success on a trial is not affected by the results that came before it. For instance, just because a coin has landed heads $5$ times in a row does not make tails any more likely on the next toss. This is called **memorylessness** (meaning that past information is not remembered).
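
We can also verify memorylessness numerically: the probability of still waiting after $m + n$ trials, given that the first $m$ trials all failed, equals the unconditional probability of still waiting after $n$ trials. A small sketch using scipy's survival function (the values of $p$, $m$, $n$ here are arbitrary):

from scipy.stats import geom

p, m, n = 0.5, 5, 3
# P(X > m + n | X > m) = P(X > m + n) / P(X > m)
print(geom.sf(m + n, p) / geom.sf(m, p))  # 0.125
print(geom.sf(n, p))                      # 0.125 = (1 - p)**n, identical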

Exponential distribution

The exponential distribution is a probability distribution representing **the probability that the interval between events occurring on average $\lambda$ times per unit time is $x$ units of time**, and it is used in examples such as the following.

- The interval between occurrences of disasters
- The interval between random system failures occurring at a constant failure rate
- The interval between one customer arriving at a store and the next customer arriving

Probability density function of the exponential distribution

The probability density function of the exponential distribution is expressed as follows.

\begin{equation}
f(x)=
    \left\{
    \begin{aligned}
          &\lambda \mathrm{e}^{-\lambda x} &(x\geq0) \\
          &0 &(x<0)\\
    \end{aligned}
    \right.
\end{equation}

Also, when the random variable $X$ follows an exponential distribution, the expected value $E(X)$ and the variance $V(X)$ are as follows.


E(X) = \frac{1}{\lambda}


V(X) = \frac{1}{\lambda^2}

For example, the expected interval between occurrences of an event that occurs $5$ times per $1$ hour (unit time) is $\frac{1}{5}$ hour $= 12$ minutes. I think this matches intuition.
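
Checking this with scipy (in scipy.stats.expon, the rate $\lambda$ enters through scale = 1/λ):

from scipy.stats import expon

lam = 5  # events per unit time (one hour)
print(expon.mean(scale=1/lam))  # 0.2 hours = 12 minutes
print(expon.var(scale=1/lam))   # 0.04 = 1/lam**2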

Now, let's draw how the shape of the exponential distribution changes when the value of $\lambda$ (the average number of events that occur per unit time) changes.

For example, consider **the distribution of the time between customer arrivals**, from a store with an average of $30$ customers per hour down to a store with an average of only $1$ customer.

import numpy as np
import matplotlib.animation as animation
import matplotlib.pyplot as plt
from scipy.stats import expon

fig = plt.figure()

def update(a):
    plt.cla()  # clear the previous frame

    x = np.arange(0, 1, 0.01)
    # scipy's expon takes the rate λ through scale = 1/λ
    y = [expon.pdf(i, scale=1/a) for i in x]

    plt.plot(x, y, label="expon λ= %d" % a)
    plt.legend()
    plt.ylim(0, 35)
    plt.xlim(0, 1.0)

ani = animation.FuncAnimation(fig,
                              update,
                              interval=500,
                              frames=np.arange(30, 0, -1),
                              blit=False)  # update() returns no artists, so blitting must be off
ani.save('Exponential_distribution.gif', writer='pillow')
plt.show()

Exponential_distribution.gif

Can you see that the shape of the distribution resembles the geometric distribution? **In fact, the exponential distribution is the continuous version of the geometric distribution.** (If you shrink the time step of the geometric distribution while taking an appropriate limit that sends the per-step success probability to $0$, you obtain an exponential distribution.)
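
Here is a minimal sketch of that limit (my own illustration of the statement above): divide one unit of time into $n$ slots, give each slot a success probability of $\lambda / n$, and compare the probability that the first success comes after time $x$ with the exponential survival probability. As $n$ grows, the two agree:

from scipy.stats import geom, expon

lam, x = 5.0, 0.3
for n in [10, 100, 10000]:
    # geometric: first success after slot x*n, with per-slot success probability λ/n
    print(n, geom.sf(int(x * n), lam / n), expon.sf(x, scale=1/lam))
# the geometric column converges to e^(-λx) ≈ 0.2231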

In addition, the shape of the exponential distribution has three notable points.

- The smaller $\lambda$ is, the more slowly the density decreases
- The density is monotonically decreasing regardless of the value of $\lambda$
- The closer $x$ is to $0$, the higher the probability density

The first point is easy to picture: at a store averaging $15$ customers per hour, the next customer arrives sooner than at a store averaging $5$ customers per hour.

The second and third points mean that **when a customer arrives at the store, the single most likely time for the next customer to arrive is immediately afterwards**. I think some people feel something is wrong with this. In fact, it derives from the **memorylessness** of the exponential distribution: just because $1$ customer has arrived does not mean the next customer will stay away for a while, nor that arrivals tend to come in streaks; we assume customer arrivals are completely random. Under that assumption, the idea is that **a short wait is always more probable than a long one**. (This memorylessness is the same property that appeared in the geometric distribution.)
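
The same numerical check as for the geometric distribution works here (a small sketch; the numbers are arbitrary): given that no customer has arrived for a time $s$, the probability of waiting at least a further $t$ is unchanged.

from scipy.stats import expon

lam, s, t = 30, 0.1, 0.05
# P(X > s + t | X > s) = P(X > s + t) / P(X > s)
print(expon.sf(s + t, scale=1/lam) / expon.sf(s, scale=1/lam))  # ≈ 0.2231
print(expon.sf(t, scale=1/lam))                                 # ≈ 0.2231, the same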

Cumulative distribution function of the exponential distribution

What we actually want to know in daily life is the probability that the next customer will come **within $10$ minutes**, rather than the probability that the next customer will come exactly $10$ minutes later. In that case, we need to add up everything from the probability that the next customer arrives after $0$ seconds to the probability that they arrive after $10$ minutes.

For this, we use the following cumulative distribution function, obtained by integrating the probability density function. It represents **the probability that an event occurring on average $\lambda$ times per unit time will occur within $x$ units of time**.


F(x) = 1 - \mathrm{e}^{-\lambda x}
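
For example, the probability that the next customer arrives within $10$ minutes at a store averaging $30$ customers per hour can be computed directly from this formula (a quick sketch; the numbers are just for illustration):

import numpy as np
from scipy.stats import expon

lam = 30      # customers per hour
x = 10 / 60   # 10 minutes, in hours
print(1 - np.exp(-lam * x))       # ≈ 0.9933, straight from the formula
print(expon.cdf(x, scale=1/lam))  # the same value via scipy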

Now let's draw how the shape of the graph of the cumulative distribution function changes as the value of $\lambda$ (the average number of events that occur per unit time) changes.

As before, think of it as **the distribution of the probability that the next customer arrives within a given number of minutes**, from a store with an average of $30$ customers per hour down to a store with an average of $1$ customer per hour.


import numpy as np
import matplotlib.animation as animation
import matplotlib.pyplot as plt
from scipy.stats import expon

fig = plt.figure()

def update(a):
    plt.cla()  # clear the previous frame

    x = np.arange(0, 1.2, 0.01)
    # scipy's expon takes the rate λ through scale = 1/λ
    y = [expon.cdf(i, scale=1/a) for i in x]

    plt.plot(x, y, label="expon λ= %d" % a)

    plt.legend()
    plt.ylim(0, 1.2)
    plt.xlim(0, 1.2)

ani = animation.FuncAnimation(fig,
                              update,
                              interval=500,
                              frames=np.arange(30, 0, -1),
                              blit=False)  # update() returns no artists, so blitting must be off
ani.save('Exponential_cumulative_distribution.gif', writer='pillow')
plt.show()

Exponential_cumulative_distribution.gif

You can see that the curve of the distribution rises more steeply as $\lambda$ gets larger. It is convincing that at a store with an average of $30$ customers per hour, the probability that the next customer arrives within $10$ minutes is higher than at a store that only $1$ customer visits per hour.

Negative binomial distribution

Although it rarely appears in Statistical Test Grade 2, the negative binomial distribution is closely related to the distributions dealt with so far, so we cover it as well. The distribution followed by $X$, the number of trials required for independent trials (Bernoulli trials) with only two possible outcomes to succeed $r$ times, is called the negative binomial distribution.


P(X = k) = {}_{k-1} C _{r-1}p^r(1-p)^{k-r}

- The number of rolls until a die shows 1 for the $3$rd time
- The number of tosses until a coin lands heads for the $5$th time

and so on follow a negative binomial distribution.

As the name suggests, the negative binomial distribution is an extended version of the binomial distribution, with the following difference.

- Binomial distribution: the number of trials is fixed, and the number of successes is the random variable
- Negative binomial distribution: the number of successes is fixed, and the number of trials is the random variable

If you set $r = 1$, it becomes exactly the geometric distribution. (This is because the geometric distribution is the probability distribution of the number of trials until the first success.)
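
This reduction is easy to confirm numerically. One caveat: scipy's nbinom counts the number of failures before the $r$-th success rather than the number of trials, so we shift by $r$ when comparing (a minimal sketch):

from scipy.stats import geom, nbinom

p, r = 1/6, 1
for k in range(1, 6):  # k = total number of trials
    print(geom.pmf(k, p), nbinom.pmf(k - r, r, p))  # the two columns are identical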

Furthermore, a Poisson distribution whose parameter $\lambda$ itself follows a gamma distribution becomes a negative binomial distribution. (The gamma distribution is beyond the scope of Statistical Test Grade 2 and is not covered in this article.)

Also, when the random variable $X$ follows a negative binomial distribution, the expected value $E(X)$ and the variance $V(X)$ are as follows.


E(X) = \frac{r}{p}


V(X) = \frac{r(1-p)}{p^2}

For example, the expected number of trials required to roll a 1 on a die $5$ times is $\frac{5}{\frac{1}{6}} = 30$.
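
Checking this with scipy (again, nbinom.mean returns the expected number of failures, $r(1-p)/p$, so we add $r$ to get the expected number of trials):

from scipy.stats import nbinom

p, r = 1/6, 5
print(nbinom.mean(r, p) + r)  # 30.0 = r/p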

Now let's move the probability of success between $10$% and $90$% for the probability distribution of the number of trials required for an event to succeed $10$ times (the negative binomial distribution).


import math

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation

def comb_(n, k):
    # binomial coefficient nCk
    return math.factorial(n) / (math.factorial(n - k) * math.factorial(k))

def negative_binomial_dist(p, k, r):
    # P(X = k) = (k-1)C(r-1) * p^r * (1-p)^(k-r): the r-th success on the k-th trial
    return comb_(k - 1, r - 1) * (p ** r) * ((1 - p) ** (k - r))

fig = plt.figure()

def update(a):
    plt.cla()  # clear the previous frame

    r = 10  # number of successes we are waiting for
    x = np.arange(r, 70, 1)
    y = [negative_binomial_dist(a, i, r) for i in x]

    plt.bar(x, y, align="center", width=0.4, color="blue",
            alpha=0.5, label="Negative binomial p= " + "{:.1f}".format(a))

    plt.legend()
    plt.ylim(0, 0.4)
    plt.xlim(10, 70)

ani = animation.FuncAnimation(fig,
                              update,
                              interval=1000,
                              frames=np.arange(0.1, 1, 0.1),
                              blit=False)  # update() returns no artists, so blitting must be off
ani.save('Negative_binomial_distribution.gif', writer='pillow')
plt.show()

Negative_binomial_distribution.gif

You can see that the higher the probability of success, the more the number of trials needed for $10$ successes concentrates near $10$.

Relationship between probability distributions

We have now dealt with the various probability distributions in scope for Statistical Test Grade 2, and in fact they are all related to one another. The following diagram shows the relationships among the probability distributions dealt with so far.

分布間の関係性.png

It is hard to see these connections from the formulas alone, but I think your understanding deepens if you think about the relationships between the probability distributions while actually drawing them.

NEXT: Next time, I will cover the normal distribution and the t distribution.
