[PYTHON] Concept of Bayesian reasoning (2) ... Bayesian estimation and probability distribution

From "Bayesian Inference Experienced in Python"

I had written this article once, but I wasn't satisfied with it, so I deleted it and wrote it again. Sigh.

What is Bayesianism?

Frequency and Bayesian

Frequentism is the classical school of statistics that has been around for a long time. In frequentism, probability is regarded as the long-run frequency of events.

Bayesianism regards probability as the degree of "belief" or "confidence" that an event will occur. Thinking of probability as a belief is actually a natural idea for humans.

"Probability is a belief"

The belief that an event $A$ occurs is written $P(A)$ and called the prior probability. The belief updated with evidence $X$ is written $P(A|X)$: the probability of $A$ given the evidence $X$. This is called the posterior probability.

Bayesian inference concept

Bayesian inference returns a probability, while frequentist inference returns a number that represents an estimate. This is an important distinction worth remembering.

For example, consider a program. Asked "This program has passed all the tests (evidence $X$). Is this program okay?", a frequentist would answer "Yes, the program has no bugs."

A Bayesian would answer: "Yes, the probability of having no bugs is 0.8; no, the probability of having bugs is 0.2." In Bayesianism, you can always incorporate prior knowledge, such as the belief that programs tend to have bugs, as an argument.

As the amount of evidence (information $X$) grows, and in the limit of infinitely many (very many) pieces of evidence, frequentism and Bayesianism arrive at similar inference results.
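A minimal sketch of this convergence, using a coin-flip model; the true probability, the uniform Beta(1, 1) prior, and the sample sizes are all illustrative assumptions, not from the text:

```python
import numpy as np

rng = np.random.default_rng(0)
true_p = 0.7                          # true probability of heads (illustrative)
flips = rng.random(100_000) < true_p  # simulated coin flips

for n in [10, 1_000, 100_000]:
    heads = flips[:n].sum()
    freq_estimate = heads / n              # frequentist point estimate
    bayes_mean = (heads + 1) / (n + 2)     # posterior mean under a uniform Beta(1, 1) prior
    print(n, round(freq_estimate, 4), round(bayes_mean, 4))
```

For small $n$ the two estimates differ noticeably; by $n = 100{,}000$ they agree to several decimal places.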

About big data

Relatively simple algorithms are used for analysis and prediction on big data. In other words, the difficulty of big data analysis does not lie in the algorithm. The harder problems are "medium data" and "small data", and this is where Bayesianism shines.

Bayes' theorem

Bayes' theorem (Bayes' law)


P(A | X) = \displaystyle \frac{P(X | A)\, P(A)}{P(X)}

Bayes' theorem simply connects the prior probability $P(A)$ with the updated posterior probability $P(A|X)$ mathematically.

Probability distribution

For discrete values

If $Z$ is a discrete random variable, its distribution is given by a probability mass function: the probability that $Z$ takes the value $k$, written $P(Z = k)$. A representative probability mass function is the Poisson distribution. If $Z$ follows a Poisson distribution, its probability mass function is given by the following equation.


P(Z = k) =\frac{ \lambda^k e^{-\lambda} }{k!}, \; \; k=0,1,2, \dots

$\lambda$ is a parameter that determines the shape of the distribution; for the Poisson distribution, $\lambda$ is a positive real number. Increasing $\lambda$ puts more probability on large values, and decreasing $\lambda$ puts more probability on small values. So $\lambda$ can be seen as the intensity of the Poisson distribution.

$k$ is a non-negative integer. Note that $k$ must be an integer. We write that the random variable $Z$ follows a Poisson distribution as follows.


 Z\sim \text{Poi}(\lambda) 

A convenient property of the Poisson distribution is that the expected value is equal to the distribution parameter.


E[\; Z \;|\; \lambda \;] = \lambda
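This property is easy to check numerically; a quick sketch using scipy's Poisson object and a large simulated sample (the value 4.25 is just one of the $\lambda$ values used below):

```python
import numpy as np
import scipy.stats as stats

lambda_ = 4.25
# exact mean of the Poisson distribution, as computed by scipy
print(stats.poisson.mean(lambda_))  # 4.25

# empirical check: the sample mean of many Poisson draws approaches lambda
rng = np.random.default_rng(0)
sample = rng.poisson(lambda_, size=100_000)
print(round(sample.mean(), 2))      # close to 4.25
```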

Below is a plot of the probability mass function for several values of $\lambda$.

# -*- coding: utf-8 -*-
import matplotlib.pyplot as plt
import scipy.stats as stats
import numpy as np

a = np.arange(25)
# Poisson distribution object from scipy
poi = stats.poisson
lambda_ = [1.5, 4.25, 8.50]
colours = ["#348ABD", "#A60628", "#5AFF19"]

plt.bar(a, poi.pmf(a, lambda_[0]), color=colours[0],
        label=r"$\lambda = %.1f$" % lambda_[0], alpha=0.60,
        edgecolor=colours[0], lw=3)

plt.bar(a, poi.pmf(a, lambda_[1]), color=colours[1],
        label=r"$\lambda = %.1f$" % lambda_[1], alpha=0.60,
        edgecolor=colours[1], lw=3)

plt.bar(a, poi.pmf(a, lambda_[2]), color=colours[2],
        label=r"$\lambda = %.1f$" % lambda_[2], alpha=0.60,
        edgecolor=colours[2], lw=3)  # edgecolor fixed to match the third bar

plt.xticks(a)
plt.legend()
plt.ylabel("probability of $k$")
plt.xlabel("$k$")
plt.title(r"Probability mass function of a Poisson random variable;"
          r" differing $\lambda$ values")
plt.show()

The book used two values, $\lambda = 1.5$ and $4.25$, so I added a third, $8.5$, and plotted all three.

[Figure: Poisson probability mass functions for $\lambda = 1.5, 4.25, 8.5$]

The graph confirms that increasing $\lambda$ puts more probability on large values and decreasing $\lambda$ puts more probability on small values.

For continuous values

A continuous random variable is described not by a probability mass function but by a probability density function. A representative probability density function is the exponential distribution.

f_Z(z | \lambda) = \lambda e^{-\lambda z }, \;\; z\ge 0

A random variable with an exponential distribution takes non-negative values. Because it is continuous, it suits data taking positive real values, such as time or temperature (in kelvins). The random variable $Z$ follows an exponential distribution when its density function is exponential. In other words:

Z \sim \text{Exp}(\lambda)

The expected value of the exponential distribution is the reciprocal of the parameter $\lambda$.

E[\; Z \;|\; \lambda \;] = \frac{1}{\lambda}
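Note that scipy parameterizes the exponential distribution by `scale` $= 1/\lambda$, which is why the plotting code below passes `scale=1./l`. A quick sketch checking that parameterization against the formulas above (the value $\lambda = 0.5$ is one of the values used in the plot):

```python
import numpy as np
import scipy.stats as stats

lam = 0.5
z = np.linspace(0, 10, 50)

# scipy's expon uses scale = 1/lambda, so its pdf matches lam * exp(-lam * z)
assert np.allclose(stats.expon.pdf(z, scale=1/lam), lam * np.exp(-lam * z))

# and the mean is the reciprocal of lambda
print(stats.expon.mean(scale=1/lam))  # 2.0
```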

# -*- coding: utf-8 -*-
import matplotlib.pyplot as plt
import scipy.stats as stats
import numpy as np

a = np.linspace(0, 10, 100)
expo = stats.expon
lambda_ = [0.5, 1, 5]
colours = ["#348ABD", "#A60628", "#5AFF19"]

for l, c in zip(lambda_, colours):
    # scipy parameterizes the exponential by scale = 1/lambda
    plt.plot(a, expo.pdf(a, scale=1./l), lw=3,
             color=c, label=r"$\lambda = %.1f$" % l)
    plt.fill_between(a, expo.pdf(a, scale=1./l), color=c, alpha=.33)

plt.legend()
plt.ylabel("PDF at $z$")
plt.xlabel("$z$")
plt.ylim(0, 1.2)
plt.title(r"Probability density function of an exponential random variable;"
          r" differing $\lambda$")
plt.show()

[Figure: exponential probability density functions for $\lambda = 0.5, 1, 5$]

What is λ?

We cannot observe $\lambda$ directly; all we can observe is $Z$. Moreover, there is no one-to-one correspondence between $\lambda$ and $Z$.

How does Bayesian inference treat the value of $\lambda$? As a belief. So rather than pinning down the exact value of $\lambda$, it is important to think about a probability distribution over $\lambda$, which says how likely $\lambda$ is to take each value.

If you object that $\lambda$ is a constant, not a random variable, and ask how one can assign a probability to a value that is not random at all, you have already been infected by frequentism. Bayesianism regards probability as a belief, so in fact you can assign one to anything.

In other words

"Probability is a belief"

That's what it means.
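As a rough sketch of what "a probability distribution over $\lambda$" means, here is a grid approximation of the posterior for a Poisson $\lambda$; the data and the flat prior are made up for illustration (the next article does this properly with pymc3):

```python
import numpy as np
import scipy.stats as stats

data = np.array([5, 3, 4, 6, 4])   # made-up Poisson counts (illustrative)
grid = np.linspace(0.1, 12, 500)   # candidate values of lambda

prior = np.ones_like(grid)         # flat prior belief over lambda
# likelihood of the whole dataset at each candidate lambda
likelihood = np.array([stats.poisson.pmf(data, lam).prod() for lam in grid])
posterior = prior * likelihood
posterior /= posterior.sum()       # normalize the grid weights to sum to 1

# the posterior peaks near the sample mean (here 4.4)
print(round(grid[posterior.argmax()], 2))
```

The result is not a single value of $\lambda$ but a whole distribution of belief over it, which is exactly the Bayesian answer.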

Next article "Bayesian inference concept (3) ... actual calculation by pymc3"
