Summarizing and implementing statistical hypothesis testing in Python

Introduction

This article summarizes statistical hypothesis testing. I plan to write a later article on "statistical power" and "effect size", and this one is the first step toward it. I am not an expert in statistics, so I would appreciate it if you could point out any mistakes.

References

I used the following as references when summarizing statistical hypothesis testing.

- Statistics time
- Introduction to Statistics (Basic Statistics I), Department of Statistics, Faculty of Liberal Arts, University of Tokyo
- What is a hypothesis test?

About statistical hypothesis testing

This section walks through the flow of a statistical hypothesis test and then implements it in Python.

What is a statistical hypothesis test?

Wikipedia explains the statistical hypothesis test as follows.

Statistical hypothesis testing is one of the statistical methods for testing, from a sample, a hypothesis about a parameter of the population distribution. The Japanese Industrial Standards define a statistical hypothesis as "a statement about a population parameter or probability distribution. There are null hypotheses and alternative hypotheses." A test (statistical test) is "a statistical procedure for deciding, based on the observed values, whether to reject the null hypothesis and support the alternative hypothesis or not to reject the null hypothesis. The procedure is chosen so that the probability of rejecting the null hypothesis even though it holds is α or less. This α is called the significance level."

Technical terms such as "null hypothesis", "alternative hypothesis", and "significance level" appear and make the definition hard to follow, but I understand it as a method that carries out verification with the following logic.

**Assume that a hypothesis is correct and calculate the probability of obtaining data like the data actually observed; if that probability is small enough, judge that the hypothesis is unlikely to hold.**

Statistical hypothesis testing procedure

The statistical hypothesis test is performed according to the following procedure.

  1. Make a null hypothesis
  2. Determine the test method
  3. Determine the significance level and rejection region
  4. Calculate statistics
  5. Reject or retain the null hypothesis

If only the procedure is listed, it is abstract and difficult to understand, so I will explain it with a concrete example.

Specific example of statistical hypothesis test


We played a game of five coin tosses in which I would receive 500 yen whenever tails came up and pay 500 yen whenever heads came up. As a result, heads came up all five times and I had to pay 2500 yen. Something smells fishy, but can we really say that this coin is rigged?


Some people may say, "Five heads in a row happens sometimes, doesn't it?", while others may say, "Five heads in a row is strange." A statistical hypothesis test lets us judge such questions objectively.
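Before working through the test, a quick simulation can give a feel for how often a fair coin produces five heads in a row. The following is a minimal sketch using only the standard library. The function name `simulate_all_heads`, the number of simulated games, and the random seed are assumptions chosen for illustration and are not part of the original procedure.

```python
import random


def simulate_all_heads(n_tosses=5, n_games=100_000, p_heads=0.5, seed=0):
    """Estimate how often every toss in a game comes up heads.

    All parameter names and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    all_heads = 0
    for _ in range(n_games):
        # A game is counted only if each of its n_tosses tosses is heads
        if all(rng.random() < p_heads for _ in range(n_tosses)):
            all_heads += 1
    return all_heads / n_games


estimate = simulate_all_heads()
print(f"estimated probability of 5 heads in 5 tosses: {estimate:.4f}")
print(f"exact probability for a fair coin:            {0.5 ** 5:.4f}")
```

The simulated frequency should land close to the exact value, which suggests that five heads in a row is rare but not impossible for a fair coin.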

Make a null hypothesis

This time, as the hypothesis to be nullified (the hypothesis I want to reject), I set up the hypothesis that **this coin is not rigged**. If it is not rigged, the probability of heads is $p = 0.5$, so the hypothesis is expressed as follows.

Null hypothesis: $H_{0}: p = 0.5$

The alternative hypothesis (the probability of heads is greater than 50%) is as follows.

Alternative hypothesis: $H_{1}: p > 0.5$

Determine the test method

This time, we will use the binomial test. (When the sample size is large, the binomial distribution can be approximated by a normal distribution, so other methods can also be used.)
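To illustrate the remark about the normal approximation, here is a small sketch that compares the exact binomial upper-tail probability with a normal approximation. The helper names, the continuity correction, and the larger example (60 heads in 100 tosses) are assumptions added for illustration.

```python
import math


def binom_upper_tail(x, n, p):
    """Exact P(X >= x) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x, n + 1))


def normal_upper_tail(x, n, p):
    """Normal approximation to P(X >= x), with a continuity correction."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    z = (x - 0.5 - mean) / sd
    # Upper tail of the standard normal: 1 - Phi(z) = erfc(z / sqrt(2)) / 2
    return 0.5 * math.erfc(z / math.sqrt(2))


# The small case from this article and a hypothetical larger one
for x, n in [(5, 5), (60, 100)]:
    exact = binom_upper_tail(x, n, 0.5)
    approx = normal_upper_tail(x, n, 0.5)
    print(f"n={n:3d}, x={x:3d}: exact={exact:.4f}, normal approx={approx:.4f}")
```

With only 5 tosses the two values differ noticeably, which is one reason to use the exact binomial test here. With 100 tosses the approximation is much closer.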

Determine significance level and rejection region

In this test, the **significance level is set to 5%**. Assuming that the null hypothesis $H_{0}: p = 0.5$ holds, **if the probability of obtaining the observed data is 5% or less**, the null hypothesis is rejected (that is, **this coin may be rigged**).

Also, because this test is meant **to verify the suspicion that the coin comes up heads abnormally often**, we perform a **one-sided test**. The rejection region is on one side only.
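As a rough sketch of what the one-sided rejection region looks like for this example, the code below lists every number of heads whose upper-tail probability under the null hypothesis is at most the significance level. The constant and function names are assumptions introduced for illustration.

```python
import math

ALPHA = 0.05  # significance level chosen in the article
N_TOSSES = 5  # number of coin tosses
P_NULL = 0.5  # probability of heads under the null hypothesis


def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# One-sided (upper) rejection region: every head count k whose
# upper-tail probability under the null hypothesis is at most ALPHA.
rejection_region = [
    k for k in range(N_TOSSES + 1) if upper_tail(k, N_TOSSES, P_NULL) <= ALPHA
]
print(rejection_region)  # [5]
```

With 5 tosses, only the outcome "5 heads" falls in the rejection region. Four heads already has an upper-tail probability of 6/32 = 0.1875, which is above 5%.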

Calculate statistics

The statistic called the p-value is the probability, given that the null hypothesis is correct, of obtaining the observed value or data more extreme than it.

Therefore, the p-value in this case is the probability that heads comes up 5 times in 5 tosses, assuming that the probability of heads is $50\%$.

$$ \left(\frac{1}{2}\right)^5 = \frac{1}{32} = 0.03125 $$
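The same calculation can be checked with exact fractions. This is a minimal sketch, and the function name `one_sided_p_value` is an assumption. It sums the probability of the observed count and of every more extreme count, which for 5 heads in 5 tosses is a single term.

```python
from fractions import Fraction
from math import comb


def one_sided_p_value(x, n, p):
    """Exact P(X >= x) for X ~ Binomial(n, p), kept as a fraction."""
    p = Fraction(p)
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x, n + 1))


p_value = one_sided_p_value(x=5, n=5, p=Fraction(1, 2))
print(p_value, float(p_value))  # 1/32 0.03125
```

Working with `Fraction` avoids rounding and confirms that 1/32 is exactly 0.03125.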

Reject or retain the null hypothesis

Now that we have the information needed for the hypothesis test, we decide whether to reject or retain the null hypothesis. The significance level of this test was $0.05$ ($5\%$), and the statistic (p-value) was calculated as $0.03125$.

Since $0.03125 < 0.05$, the null hypothesis is rejected and the alternative hypothesis is adopted.

Therefore, at the 5% significance level, the data support $H_{1}: p > 0.5$ (this coin is rigged).
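The decision step can be sketched in plain Python as follows. For comparison, the sketch also computes a two-sided p-value, to show how much the choice of a one-sided test matters in this example. The two-sided convention used here (adding up every outcome that is no more likely than the observed one) is an assumption and only one of several common conventions. The function names are also illustrative.

```python
from math import comb

ALPHA = 0.05  # significance level


def pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def p_values(x, n, p):
    """Return (one-sided upper-tail, two-sided) p-values for x successes."""
    one_sided = sum(pmf(k, n, p) for k in range(x, n + 1))
    # Two-sided convention assumed here: add up every outcome that is
    # no more likely than the observed one under the null hypothesis.
    observed = pmf(x, n, p)
    two_sided = sum(pmf(k, n, p) for k in range(n + 1) if pmf(k, n, p) <= observed)
    return one_sided, two_sided


one_sided, two_sided = p_values(x=5, n=5, p=0.5)
for name, value in [("one-sided", one_sided), ("two-sided", two_sided)]:
    decision = "reject H0" if value <= ALPHA else "do not reject H0"
    print(f"{name}: p = {value:.5f} -> {decision}")
```

Under this convention the two-sided p-value is 2/32 = 0.0625, so the same data would not lead to rejection at the 5% level in a two-sided test.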

Perform a statistical hypothesis test in Python

The above calculation can be easily performed in Python. Below are the results of a binomial test using scipy 1.3.1.


from scipy import stats

# x: number of successes observed (here, the number of heads)
# n: number of trials
# p: success probability assumed under the null hypothesis
# alternative: 'two-sided', 'greater', or 'less'; selects a two-sided test or
#              the side of a one-sided test
# Note: binom_test was removed in SciPy 1.12. On recent versions, use
# stats.binomtest(k=5, n=5, p=0.5, alternative='greater').pvalue instead.
p = stats.binom_test(x=5, n=5, p=0.5, alternative='greater')
print(p)

The output is shown below. The p-value is computed according to the arguments specified.

0.03125

By comparing this p-value with the chosen significance level to decide whether to reject the null hypothesis, the binomial test can be carried out easily. (To perform a different test, use the corresponding function.)

Draw distribution in Python

So far the hypothesis test has been carried out by calculation alone, but it becomes much easier to understand once the distribution is actually drawn. Here we draw the distribution of the number of heads in 5 coin tosses.

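import math

import matplotlib.pyplot as plt
import numpy as np

# In a Jupyter notebook, run `%matplotlib inline` beforehand to render the plot inline.


def comb_(n, k):
    # Binomial coefficient: the number of ways to choose k items out of n
    return math.factorial(n) // (math.factorial(n - k) * math.factorial(k))


def binomial_dist(p, n, k):
    # Probability of exactly k successes in n trials with success probability p
    return comb_(n, k) * (p ** k) * ((1 - p) ** (n - k))


# Possible numbers of heads in 5 tosses: 0, 1, ..., 5
x = np.arange(0, 6, 1)

# Probability of each number of heads for a fair coin (p = 0.5)
y = [binomial_dist(0.5, 5, int(k)) for k in x]

plt.bar(x, y, alpha=0.5)
plt.show()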

[Figure: bar chart of the distribution of the number of heads in 5 coin tosses]

The plot above shows the result. You can see at a glance that **the probability of heads coming up 5 times is below the significance level of 0.05**.

NEXT

Next time, I will summarize type I errors, type II errors, and statistical power in hypothesis testing.
