Implement R's power.prop.test function in python

――I want to find sample size and verification ability when designing a goodness-of-fit test. --You can find it by using R's power.prop.test function. --I want to use a similar function in python ...

So, as an exercise, I implemented it in python based on the source code of R's power.prop.test function. Please refer to R official documentation for the meaning of the arguments. The code is at the end of the article.

Example of use

Find the sample size

--Assuming that Group 1 has a 1% chance and Group 2 has a 3% chance of having some effect. --Significance level is 0.05 --Detective power is 0.8 --One-sided test

Find the sample size n to satisfy the above conditions.

# python
> params_dict = power_prop_test(p1=0.01, p2=0.03, power=0.8, sig_level=0.05, alternative="one_sided")

    Sample size: n = 604.845426012357
    Probability: p1 = 0.01, p2 = 0.03
    sig_level = 0.05,
    power = 0.8,
    alternative = one_sided
    
    NOTE: n is number in "each" group

> print(params_dict)

{'n': 604.845426012357, 'p1': 0.01, 'p2': 0.03, 'sig_level': 0.05, 'power': 0.8, 'alternative': 'one_sided'}

It is almost the same as the result of passing the same value in R.

# R
> power.prop.test(p1=0.01, p2=0.03, power=0.8, sig.level=0.05, alternative="one.sided")

     Two-sample comparison of proportions power calculation 

              n = 604.8434
             p1 = 0.01
             p2 = 0.03
      sig.level = 0.05
          power = 0.8
    alternative = one.sided

NOTE: n is number in *each* group

Find detection power

Next, let's find the detection power (power) when the sample size n = 1000 in the same way.

#python
> power_prop_test(n=1000, p1=0.01, p2=0.03, sig_level=0.05, alternative="one_sided")

    Sample size: n = 1000
    Probability: p1 = 0.01, p2 = 0.03
    sig_level = 0.05,
    power = 0.9398478094069961,
    alternative = one_sided
    
    NOTE: n is number in "each" group
#R
> power.prop.test(n=1000, p1=0.01, p2=0.03, sig.level=0.05, alternative="one.sided")

     Two-sample comparison of proportions power calculation 

              n = 1000
             p1 = 0.01
             p2 = 0.03
      sig.level = 0.05
          power = 0.9398478
    alternative = one.sided

NOTE: n is number in *each* group

This also takes almost the same value.

Supplement

--It may be easier to call the R function directly with PypeR etc. that runs R on python. --Depending on the initial value, it may not converge and may throw an error.

Thank you for reading.

Power_prop_test function used

#Set the desired parameter to None
#alternative is"one_sided" "two_sided"Select either
#Returns each obtained parameter as a dictionary type
def power_prop_test(n=None, p1=None, p2=None, sig_level=0.05, power=None,
                    alternative="one_sided", strict=False,
                    tol=2.220446049250313e-16**0.25):
    
    from math import sqrt
    from scipy.stats import norm
    from scipy.optimize import brentq
    
    missing_params = [arg for arg in [n, p1, p2, sig_level, power] if not arg]
    num_none = len(missing_params)
    
    if num_none != 1:
        raise ValueError("exactly one of 'n', 'p1', 'p2', 'power' and 'sig.level'must be None")
        
    if sig_level is not None:
        if sig_level < 0 or sig_level > 1:
            raise ValueError("\"sig_level\" must be between 0 and 1")

    if power is not None:
        if power < 0 or power > 1:
            raise ValueError("\"power\" must be between 0 and 1")
        
    if alternative not in ["two_sided", "one_sided"]:
        raise ValueError("alternative must be \"two_sided\" or \"one_sided\"")
    t_side_dict = {"two_sided":2, "one_sided":1}
    t_side = t_side_dict[alternative]
        
    # compute power 
    def p_body(p1=p1, p2=p2, sig_level=sig_level, n=n, t_side=t_side, strict=strict):
        if strict and t_side==2:
            qu = norm.ppf(1-sig_level/t_side)
            d = abs(p1-p2)
            q1 = 1-p1
            q2 = 1-p2
            pbar = (p1 + p2) / 2
            qbar = 1 - pbar
            v1 = p1 * q1
            v2 = p2 * q2
            vbar = pbar * qbar
            power_value = (norm.cdf((sqrt(n) * d - qu * sqrt(2 * vbar))/sqrt(v1 + v2))
                    + norm.cdf(-(sqrt(n) * d + qu * sqrt(2 * vbar))/sqrt(v1 + v2)))
            return power_value
                
        else:
            power_value = norm.cdf((sqrt(n) * abs(p1 - p2) - (-norm.ppf(sig_level / t_side)
                        * sqrt((p1 + p2) * (1 - (p1 + p2)/2)))) / sqrt(p1 * 
                        (1 - p1) + p2 * (1 - p2)))
            return power_value
    
    if power is None:
        power = p_body()

    # solve the equation for the None value argument 
    elif n is None:
        def n_solver(x, power=power):
            return p_body(n=x) - power
        n = brentq(f=n_solver, a=1, b=1e+07, rtol=tol, maxiter=1000000)

    elif p1 is None:
        def p1_solver(x, power=power):
            return p_body(p1=x) - power
        try:
            p1 = brentq(f=p1_solver, a=0, b=p2, rtol=tol, maxiter=1000000)
        except:
            ValueError("No p1 in [0, p2] can be found to achive the desired power")
        
    elif p2 is None:
        def p2_solver(x, power=power):
            return p_body(p2=x) - power
        try:
            p2 = brentq(f=p2_solver, a=p1, b=1, rtol=tol, maxiter=1000000)
        except:
            VealueError("No p2 in [p1, 1] can be found to achive the desired power")

    elif sig_level is None:
        def sig_level_solver(x, power=power):
            return p_body(sig_level=x) - power
        try:
            sig_level = brentq(f=sig_level_solver, a=1e-10, b=1-1e-10, rtol=tol, maxiter=1000000)
        except:
            ValueError("No significance level [0,1] can be found to achieve the desired power")
            
    print("""
    Sample size: n = {0}
    Probability: p1 = {1}, p2 = {2}
    sig_level = {3},
    power = {4},
    alternative = {5}
    
    NOTE: n is number in "each" group
    """.format(n, p1, p2, sig_level, power, alternative))
    #Each parameter is returned in a dictionary with argument names as keys
    params_dict = {"n":n, "p1":p1, "p2":p2,
                   "sig_level":sig_level, "power":power, "alternative":alternative}
    
    return params_dict

Recommended Posts

Implement R's power.prop.test function in python
Implement Enigma in python
Implement recommendations in Python
Implement XENO in python
Implement sum in Python
Implement Traceroute in Python 3
Create a function in Python
Use callback function in Python
ntile (decile) function in python
Implement timer function in pygame
Implement naive bayes in Python 3.3
Implement ancient ciphers in python
Nonlinear function modeling in Python
Draw implicit function in python
Implement Redis Mutex in Python
Implement extension field in Python
Implement fast RPC in Python
Immediate function in python (lie)
Implement method chain in Python
Implement Dijkstra's Algorithm in python
Implement Slack chatbot in Python
Implement stacking learning in Python [Kaggle]
Function argument type definition in python
python function ①
Included notation in Python function arguments
[Python] function
Write AWS Lambda function in Python
Measure function execution time in Python
Implement the Singleton pattern in Python
I tried to implement the mail sending function in Python
Function synthesis and application in Python
Quickly implement REST API in Python
python function ②
I tried to implement PLSA in Python
Implement __eq__ etc. generically in Python class
I tried to implement permutation in Python
Precautions when pickling a function in python
Implement FIR filters in Python and C
OR the List in Python (zip function)
Collectively implement statistical hypothesis testing in Python
I tried to implement PLSA in Python 2
I tried to implement ADALINE in Python
I tried to implement PPO in Python
Quadtree in Python --2
Python in optimization
CURL in python
Metaprogramming in Python
Python 3.3 in Anaconda
Geocoding in python
SendKeys in Python
python enumerate function
Meta-analysis in Python
Unittest in python
Python> function> Closure
Epoch in Python
Discord in Python
[Python] Generator function
Sudoku in Python
DCI in Python
quicksort in python
nCr in python