I just want to find the 95% confidence interval for the difference in population proportions in Python

This is about the confidence interval for the **difference in population proportions**, not the confidence interval for a single population proportion.

What is the difference in population proportions?

I will skip the detailed explanation here. The following site explains it clearly.

Confidence interval for the difference in population proportions

Why I want to compute it

In business, we often run a chi-square test or a test for the difference in population proportions. Of course, the conclusion of **whether there is a significant difference** matters, but if that is all you look at, it is **hard to grasp the effect size and its variability**. So the idea here is to make the result a little more intuitive.

Libraries seem to provide the confidence interval for a single population proportion, but I could not find one for the difference in population proportions (based on a one-minute search).

Reference: How to use Python to estimate the 95% confidence interval for a population proportion and determine a reasonable sample size

The formula is not complicated, so let's just implement it.

Formula

(\hat{p}_1 - \hat{p}_2) - z_{\frac{\alpha}{2}} \times \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}} \leq p_1 - p_2 \leq \\ (\hat{p}_1 - \hat{p}_2) + z_{\frac{\alpha}{2}} \times \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}

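Here \hat{p}_1 and \hat{p}_2 are the sample proportions, n_1 and n_2 are the sample sizes, p_1 - p_2 is the true difference being estimated, and z_{\frac{\alpha}{2}} is 1.96 for a 95% interval. The details are covered on the site introduced earlier. The left-hand expression is called the lower bound, and the right-hand expression is called the upper bound.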

If the interval from the lower bound to the upper bound does not contain 0, the difference can be said to be significant (at the 5% level for a 95% interval).

Reference: How to find the 95% confidence interval? Its relationship with significant differences, and the meaning and formula of 1.96
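
As a small aside on where 1.96 comes from: it is the point of the standard normal distribution that leaves 2.5% in the upper tail, so that 95% lies between -1.96 and +1.96. A minimal sketch using the standard-library `statistics.NormalDist` (the variable names `confidence` and `alpha` are only for illustration):

```python
from statistics import NormalDist

confidence = 0.95
alpha = 1 - confidence

# Upper alpha/2 point of the standard normal distribution
z = NormalDist().inv_cdf(1 - alpha / 2)
print(round(z, 2))  # 1.96
```

Changing `confidence` to, say, 0.99 gives the corresponding z value for a 99% interval.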

Source code

I belong to the "as long as it runs" school, so please forgive the rough code.

The idea is to feed in a 2x2 cross-tabulation table as a CSV file.

         Purchased   Not purchased
Man      50          100
Woman    40          120
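
The exact layout of test.csv is not shown, so the following is only an assumption: since the script reads the file with `csv.QUOTE_NONNUMERIC`, which converts every unquoted field to a float, the file presumably holds just the four counts without row or column labels. A minimal sketch that writes such a file:

```python
import csv

# Assumed contents of test.csv: counts only, no header row or label column.
rows = [
    [50, 100],  # Man:   purchased, not purchased
    [40, 120],  # Woman: purchased, not purchased
]

with open('test.csv', 'w', newline='') as f:
    csv.writer(f).writerows(rows)
```

If the file kept unquoted labels such as "Man", the numeric conversion would raise a `ValueError`.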

main.py


import csv
import numpy as np

# Parameter: z value for a two-sided 95% interval
z = 1.96

# Read the test data (rows: groups; columns: purchased, not purchased)
with open('test.csv', newline='') as f:
    reader = csv.reader(f, quoting=csv.QUOTE_NONNUMERIC)
    d = [row for row in reader]

# Sample sizes and sample proportions for each group
n = [sum(d[0]), sum(d[1])]
p = [d[0][0] / n[0], d[1][0] / n[1]]

# Calculate the 95% confidence interval
se = np.sqrt(p[0] * (1 - p[0]) / n[0] + p[1] * (1 - p[1]) / n[1])
lb = (p[0] - p[1]) - z * se
ub = (p[0] - p[1]) + z * se

# Output the result
print('95% confidence interval for the difference in population proportions: {:.3f} <= p1 - p2 <= {:.3f}'.format(lb, ub))
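
For reuse, the same calculation can be wrapped in a function that takes the counts directly. This is only a sketch: `diff_proportion_ci` and its parameters are hypothetical names, not a library API, and the z value is taken from the standard-library `statistics.NormalDist` instead of being hard-coded as 1.96.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Normal-approximation interval for p1 - p2 (hypothetical helper)."""
    p1, p2 = x1 / n1, x2 / n2
    # Two-sided critical value, about 1.96 when confidence is 0.95
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se


# Counts from the table: 50 of 150 men and 40 of 160 women purchased
lb, ub = diff_proportion_ci(50, 150, 40, 160)
print('{:.3f} <= p1 - p2 <= {:.3f}'.format(lb, ub))  # -0.018 <= p1 - p2 <= 0.184
print('Interval contains 0:', lb <= 0 <= ub)         # Interval contains 0: True
```

With these counts the interval contains 0, so the difference between the two purchase rates is not significant at the 5% level.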

In conclusion

This may be a niche topic, but I hope it comes in handy.
