[PYTHON] Bayesian statistics hypothesis test


In this article, I will introduce methods of hypothesis testing in Bayesian statistics, and then actually run the tests in Python. I am new to both Bayesian statistics and Python, so if you find anything wrong, please leave a comment.


1. What is a hypothesis test?

Roughly speaking, a hypothesis test sets up a hypothesis $H_i$ that an unknown parameter $\theta$ belongs to an interval $I_i$, and analyzes whether the hypothesis $H_i$ can be adopted, that is, whether it can be said that the unknown parameter $\theta$ belongs to the interval $I_i$.

In Bayesian statistics, the following three methods are typical:

- Test by posterior probability
- Test by posterior odds ratio
- Test by Bayes factor

Below, the probability density function of a random variable $X$ is denoted $p(X)$, and the probability that an event $X$ occurs is denoted $P(X)$. Also, with $\Omega$ as the sample space, $A \perp B$ indicates that $A \cap B = \emptyset$ holds for sets $A$ and $B$.

2. Test by posterior probability

The idea of testing by evaluating the posterior probability is very simple. For example, consider the following hypotheses.

\left\{
\begin{split}
    H_0: &\theta \in I_0 \\
    H_1: &\theta \in I_1 \\
    &\vdots         \\
    H_k: &\theta \in I_k
\end{split}
\right.

(Note, however, that $I_i \subset \Omega$, and that the intervals $I_i$ may overlap with each other.)

At this time, the posterior probability that the hypothesis $H_i \ (i = 0, 1, \dots, k)$ holds given the observed event $X$ is

 P(H_i|X) = P(\theta \in I_i|X) = \int_{I_i} p(\theta|X) d\theta


If $P(H_i|X)$ is sufficiently large (close to 1), the hypothesis $H_i$ is adopted; if it is small (close to 0), the hypothesis $H_i$ is rejected.
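As a concrete sketch (my own numerical illustration, not part of the example in section 5): if the posterior of $\theta$ happens to be $Beta(8, 4)$, the posterior probability of a hypothetical hypothesis $H_0: \theta \in [0.5, 0.8]$ can be computed as a difference of the beta cdf:

```python
import scipy.stats as st

# Hypothetical posterior: theta | X ~ Beta(8, 4)
a, b = 8, 4
I0 = (0.5, 0.8)

# P(H_0 | X) = integral of the posterior density over I_0,
# evaluated as a difference of the Beta cdf
post_prob = st.beta.cdf(I0[1], a, b) - st.beta.cdf(I0[0], a, b)
print(post_prob)
```

Whether this value is "close enough to 1" to adopt $H_0$ is left to the analyst's threshold.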

3. Test by posterior odds ratio

Next is the test using the posterior odds ratio. While the posterior probability test allows as many hypotheses as you like, the posterior odds ratio test uses only two hypotheses.

Consider the following hypotheses.

\left\{
\begin{split}
    H_0: \theta \in I_0 \\
    H_1: \theta \in I_1
\end{split}
\right.

Here, $I_0$ and $I_1$ are chosen so that $I_0 \perp I_1$ and $I_0 \cup I_1 = \Omega$ hold. By doing this, one of $H_0$ and $H_1$ can serve as the null hypothesis and the other as the alternative hypothesis.

The posterior odds ratio is

\begin{split}
\text{posterior odds ratio} &= \frac{P(H_0|X)}{P(H_1|X)} \\
			 &= \frac{P(\theta \in I_0|X)}{P(\theta \in I_1|X)} \\
			 &= \frac{\int_{I_0}p(\theta|X)d\theta}{\int_{I_1}p(\theta|X)d\theta}
\end{split}

If the posterior odds ratio is sufficiently greater than 1, $H_0$ is adopted; if it is sufficiently smaller than 1, $H_1$ is adopted.
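As a sketch of this (my own illustration, with an assumed posterior and intervals): take a hypothetical posterior $Beta(8, 4)$ and the complementary intervals $I_0 = [0.5, 1]$, $I_1 = [0, 0.5)$, which satisfy $I_0 \perp I_1$ and $I_0 \cup I_1 = \Omega$:

```python
import scipy.stats as st

# Hypothetical posterior theta | X ~ Beta(8, 4), with
# I_0 = [0.5, 1] and I_1 = [0, 0.5) partitioning Omega = [0, 1]
a, b = 8, 4
p1 = st.beta.cdf(0.5, a, b)   # P(H_1 | X) = P(theta < 0.5 | X)
p0 = 1 - p1                   # P(H_0 | X), since I_0 and I_1 partition Omega

post_odds_ratio = p0 / p1
print(post_odds_ratio)
```

Because the two intervals partition the sample space, the two probabilities sum to 1 and the ratio directly compares the null against the alternative.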

The above is the test using the posterior odds ratio, but it has a problem: the posterior distribution is affected by the prior distribution. If the prior distribution is extremely favorable to one of the hypotheses, the posterior distribution reflects that influence and the test may not be performed correctly. For example, even if $H_1$ is actually correct, assuming a prior distribution that is overwhelmingly favorable to $H_0$ makes $P(H_0|X)$ larger than it really is and $P(H_1|X)$ smaller, so the posterior odds ratio can be overestimated. To avoid this, we may test using the Bayes factor.

4. Bayes factor test

The Bayes factor $B_{01}$ is defined by the following formula.

\begin{split}
B_{01} &= \text{posterior odds ratio} \div \text{prior odds ratio} \\
          &= \frac{P(H_0|X)}{P(H_1|X)} \div \frac{P(H_0)}{P(H_1)} \\
          &= \frac{\int_{I_0}p(\theta|X)d\theta}{\int_{I_1}p(\theta|X)d\theta} \div \frac{\int_{I_0}p(\theta)d\theta}{\int_{I_1}p(\theta)d\theta}
\end{split}

Now let's consider why this Bayes factor evaluation avoids the problem of the posterior odds ratio test.

Suppose the prior distribution is set to be overwhelmingly favorable to $H_0$, with a prior odds ratio of 100 and a posterior odds ratio of 20. In the test based on the posterior odds ratio, $H_0$ would then be adopted. Also, checking the test by posterior probability just in case, $P(H_0|X) = 20/21 \simeq 0.95$, so $H_0$ is adopted there as well. But if you calculate the Bayes factor, you get $B_{01} = 20/100 = 0.2$. The value of the Bayes factor is evaluated against the criteria by Jeffreys shown below. Since $1/10 \le 0.2 \le 1/3$, in this case the Bayes factor substantially supports the alternative hypothesis $H_1$.

[Image: Jeffreys' evaluation criteria for the Bayes factor]

[Source: https://www.slideshare.net/masarutokuoka/mcmc-35640699]
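The arithmetic in the example above can be checked directly (100 and 20 are the prior and posterior odds assumed in that example):

```python
prior_odds = 100.0   # assumed P(H_0) / P(H_1), strongly favoring H_0
post_odds = 20.0     # assumed P(H_0 | X) / P(H_1 | X)

# Posterior probability of H_0 implied by the posterior odds:
# P(H_0 | X) = odds / (1 + odds)
p_h0 = post_odds / (1 + post_odds)

# Bayes factor B_01 = posterior odds ratio / prior odds ratio
B01 = post_odds / prior_odds

print(p_h0, B01)  # ~0.952 and 0.2
```

So although both the posterior probability and the posterior odds ratio favor $H_0$, the Bayes factor falls in Jeffreys' $[1/10, 1/3]$ band, which counts as evidence for $H_1$.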

The above example is extreme, but a test using the Bayes factor can in practice lead to a different conclusion from one using the posterior odds ratio. Because the Bayes factor reduces the influence of the prior distribution, it is thought to be less likely to lead to a wrong conclusion.

5. Actually test the sample data using python

I actually calculated the posterior probability, the posterior odds ratio, and the Bayes factor and carried out the hypothesis tests. This time, for simplicity, I assumed a beta distribution as the prior and performed the tests on a model that follows a Bernoulli distribution.

First, I defined a function that takes the sample data (data), the hyperparameters of the prior distribution (a0, b0), and the interval (I) as arguments, and returns the posterior probability (post_prob), the posterior odds ratio (post_odds_ratio), and the Bayes factor (BF).

import scipy.stats as st

def ppb(data, a0, b0, I):
    n = data.size   # sample size
    y = data.sum()  # number of successes
    # prior probability that theta falls inside / outside I
    pre_p0 = st.beta.cdf(I[1], a0, b0) - st.beta.cdf(I[0], a0, b0)
    pre_p1 = 1 - pre_p0
    # by beta-Bernoulli conjugacy the posterior is Beta(y + a0, n - y + b0)
    post_p0 = st.beta.cdf(I[1], y + a0, n - y + b0) - st.beta.cdf(I[0], y + a0, n - y + b0)
    post_p1 = 1 - post_p0
    post_prob = post_p0
    post_odds_ratio = post_p0 / post_p1
    BF = post_odds_ratio * pre_p1 / pre_p0
    return post_prob, post_odds_ratio, BF
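As a quick sanity check of the cdf-based calculation in ppb (my own addition, using hypothetical data with a uniform Beta(1, 1) prior), the closed-form posterior probability can be compared with a Monte Carlo estimate from posterior draws:

```python
import numpy as np
import scipy.stats as st

# Hypothetical data: 100 Bernoulli(0.7) draws, uniform Beta(1, 1) prior
data = st.bernoulli.rvs(0.7, size=100, random_state=1)
y, n = data.sum(), data.size
I = [0.6, 0.8]

# Closed-form posterior probability via the Beta cdf (as in ppb)
exact = st.beta.cdf(I[1], y + 1, n - y + 1) - st.beta.cdf(I[0], y + 1, n - y + 1)

# Monte Carlo estimate: fraction of posterior draws landing in I
samples = st.beta.rvs(y + 1, n - y + 1, size=200_000, random_state=1)
mc = np.mean((samples >= I[0]) & (samples <= I[1]))

print(exact, mc)  # the two should agree to a few decimal places
```

The two values should agree up to Monte Carlo error, which gives some confidence that the cdf difference really is the integral of the posterior density over the interval.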

Next, I randomly generated data from a Bernoulli distribution with parameter p = 0.7 to create a sample of size 100.

import numpy as np

p = 0.7
n = 100
np.random.seed(0)
data = st.bernoulli.rvs(p, size=n)

For this data, I analyzed the following hyperparameter settings:

- A) a0 = b0 = 1
- B) a0 = 2, b0 = 5
- C) a0 = 3, b0 = 2

The interval used as the null hypothesis is I = [0.6, 0.8].

a0 = [1,2,3]
b0 = [1,5,2]
I = [0.6,0.8]
s = ['A','B','C']

for i in range(len(a0)):
    a, b, c = ppb(data, a0[i], b0[i], I)
    print('{}) \n posterior probability: {:.5f} \n posterior odds ratio: {:.5f} \n Bayes factor: {:.5f}'.format(s[i],a,b,c))

Execution result

A)
 posterior probability: 0.79632
 posterior odds ratio: 3.90973
 Bayes factor: 15.63891
B)
 posterior probability: 0.93228
 posterior odds ratio: 13.76698
 Bayes factor: 336.00387
C)
 posterior probability: 0.81888
 posterior odds ratio: 4.52127
 Bayes factor: 8.62195

Testing against Jeffreys' evaluation criteria for the Bayes factor shown earlier, the conclusions are: A) the null hypothesis is strongly supported, B) the null hypothesis is very strongly supported, and C) the null hypothesis is substantially supported. Since the true value of p was set to 0.7, we can say that reasonable conclusions were drawn.

Also, looking at the graphs of the prior beta distributions below, B) appears most likely to be affected by the prior distribution.

[Image: graphs of the prior beta distributions for A), B), C)]

The image below shows the actual graphs of the posterior distributions. Indeed, B) seems to be more affected by the prior distribution than A) and C).

[Image: graphs of the posterior distributions for A), B), C)]

Furthermore, when only the interval of the null hypothesis was changed to [0.65, 0.75], the following results were obtained.

A)
 posterior probability: 0.34440
 posterior odds ratio: 0.52532
 Bayes factor: 4.72784
B)
 posterior probability: 0.57232
 posterior odds ratio: 1.33818
 Bayes factor: 74.33720
C)
 posterior probability: 0.36755
 posterior odds ratio: 0.58115
 Bayes factor: 2.73401

The condition is stricter, and it is hard to judge from the posterior probability or the posterior odds ratio alone, but using the Bayes factor, the null hypothesis can be supported in all three cases.

Finally, I set the interval of the null hypothesis to [0.4, 0.6] and experimented with what conclusion is reached when the true value is not included in the interval.

A)
 posterior probability: 0.00019
 posterior odds ratio: 0.00019
 Bayes factor: 0.00078
B)
 posterior probability: 0.00123
 posterior odds ratio: 0.00123
 Bayes factor: 0.00515
C)
 posterior probability: 0.00020
 posterior odds ratio: 0.00020
 Bayes factor: 0.00049

Even without using the Bayes factor, the null hypothesis is properly rejected in every case.

6. Finally

Here I assumed a beta distribution as the prior and dealt only with a model that follows a Bernoulli distribution, but I would like to handle a wider variety of models in the future. If you notice any deficiencies or mistakes, please let me know. Thank you for reading to the end.


References

Teruo Nakatsuma, *Introduction to Bayesian Statistics with Python* (Practical Python Library), Asakura Shoten, 2019

[Research report on MCMC](https://www.slideshare.net/masarutokuoka/mcmc-35640699) (accessed 2020-04-24)
