[PYTHON] Test method for size difference between groups

http://gingi99.hatenablog.com/entry/2019/08/09/212415

I read the article above. It gave an easy-to-understand, detailed explanation of Sample Ratio Mismatch.

It describes Sample Ratio Mismatch as follows:

When I investigated why this happened, I found that, after the experiment started, the number of Treatment users assigned to the A/B test was significantly smaller than the number of Control users. After identifying the cause (the reason comes later) and verifying the result correctly, we were able to detect the positive difference we expected. This phenomenon is called Sample Ratio Mismatch (SRM). In other words, the Control and Treatment populations assigned to the A/B test were not sampled in the expected ratio, which led to erroneous results.

So I want to test whether the Control and Treatment populations were sampled in the expected ratio.

This article covers how to test that group A and group B of an A/B test are split 1:1. The method is the **chi-square goodness-of-fit test**.

https://bellcurve.jp/statistics/course/9494.html

In Python, the test can be run with scipy.stats.chisquare from the SciPy library.

As an example, suppose an A/B test has 10,000 users in group A and 9,900 users in group B, and check whether this matches the expected 1:1 ratio.

import numpy as np
from scipy import stats

a_num = 10000
b_num = 9900
# Under a 1:1 split, each group is expected to receive half of all users
expect = (a_num + b_num) / 2
observed_values = np.array([a_num, b_num])
expected_values = np.array([expect, expect])
stats.chisquare(observed_values, f_exp=expected_values)

Running this returns Power_divergenceResult(statistic=0.5025125628140703, pvalue=0.4783981994489356).

The statistic is 0.503 and the p-value is 0.478. At a 5% significance level the null hypothesis of a 1:1 split cannot be rejected, so there is no evidence that the sizes of groups A and B deviate from the expected ratio.
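Since the expected ratio here is 1:1, the same check can probably be written more briefly. As a minimal sketch: when f_exp is omitted, scipy.stats.chisquare assumes all categories are equally likely, which is the same 1:1 hypothesis.

```python
from scipy import stats

# With f_exp omitted, chisquare assumes every category is equally likely,
# i.e. the expected count for each group is the mean of the observed counts.
result = stats.chisquare([10000, 9900])
print(result.statistic, result.pvalue)
```

This shortcut only applies to an even split. For any other expected ratio, f_exp has to be passed explicitly.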

Next, try data where the difference looks real.

import numpy as np
from scipy import stats

a_num = 10000
b_num = 9000
expect = (a_num + b_num) / 2
observed_values = np.array([a_num, b_num])
expected_values = np.array([expect, expect])
stats.chisquare(observed_values, f_exp=expected_values)

The result is Power_divergenceResult(statistic=52.63157894736842, pvalue=4.023672190684225e-13). The p-value is less than 0.05, so the difference is significant; that is, SRM is occurring.
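The two runs above repeat the same steps, so they could be wrapped in a small helper. The sketch below is one possible way to do that. The name srm_check, its ratios argument, and the default alpha of 0.05 are assumptions made for illustration, not part of SciPy. It also accepts expected ratios other than 1:1 by scaling them to the observed total.

```python
import numpy as np
from scipy import stats


def srm_check(observed, ratios, alpha=0.05):
    """Hypothetical helper: goodness-of-fit test of observed counts against an expected ratio."""
    observed = np.asarray(observed, dtype=float)
    ratios = np.asarray(ratios, dtype=float)
    # Scale the ratios so the expected counts sum to the observed total,
    # which scipy.stats.chisquare requires.
    expected = observed.sum() * ratios / ratios.sum()
    result = stats.chisquare(observed, f_exp=expected)
    # The third value is True when the p-value falls below alpha (SRM suspected).
    return result.statistic, result.pvalue, result.pvalue < alpha


# Same numbers as the 1:1 example above
statistic, pvalue, srm_suspected = srm_check([10000, 9000], [1, 1])
print(statistic, pvalue, srm_suspected)

# A made-up 90:10 design whose counts match the ratio exactly
print(srm_check([9000, 1000], [9, 1]))
```

Returning the statistic and p-value together with the flag keeps the raw numbers available, so a different significance level can be applied later without rerunning the test.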

Doing it in JavaScript

Use jStat, a statistics library for JavaScript.

You have to compute the chi-square statistic yourself and then look up the p-value from the chi-square distribution.

const jStat = require('jstat');

const a = 10000
const b = 9000
const expected = (a+b)/2

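// Chi-square statistic: sum of (observed - expected)^2 / expected
const statistics = (a - expected)**2 / expected + (b - expected)**2 / expected
// 52.63157894736842

// Two groups give 1 degree of freedom
const df = 1
const pValue = 1 - jStat.chisquare.cdf(statistics, df)
// 4.0234482412415673e-13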
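For comparison, the same manual calculation can be sketched in Python. This version uses scipy.stats.chi2.sf, the survival function, rather than 1 - cdf. For p-values this small the subtraction loses floating-point precision, which may be why the JavaScript value above differs from the scipy.stats.chisquare result in its later digits.

```python
from scipy import stats

a, b = 10000, 9000
expected = (a + b) / 2

# Chi-square statistic computed by hand: sum of (observed - expected)^2 / expected
statistic = (a - expected) ** 2 / expected + (b - expected) ** 2 / expected

# Two groups give 1 degree of freedom.
df = 1
# sf(x, df) is the upper-tail probability, i.e. 1 - cdf(x, df),
# evaluated without the explicit subtraction.
p_value = stats.chi2.sf(statistic, df)
print(statistic, p_value)
```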

References

Online tools https://www.gigacalculator.com/calculators/chi-square-calculator.php?test=goodnessoffit&data=15752+0.5%0D%0A15257+0.5

Diagnosing Sample Ratio Mismatch in Online Controlled Experiments: A Taxonomy and Rules of Thumb for Practitioners

Some tests https://toukei.link/programmingandsoftware/statistics_by_python/chisqtest_by_python/