[PYTHON] Hypothesis test for product improvement

The importance of hypothesis tests and probability distributions in statistical analysis has already been explained several times, but let's look back once more at the null hypothesis and the alternative hypothesis.

Scenario for product improvement

Company D is developing an arithmetic unit for scientific computers. The R&D team has now built a new prototype that improves on the existing version. To check whether performance had really improved, the company's quality control team decided to run a software benchmark on a sample of 50 units.
According to the quality control team, the performance score of the original product averaged 1294 with a standard deviation of about 34. The sampled new product averaged 1311 with a standard deviation of 28.3.

Null hypothesis and alternative hypothesis

Going by the story alone, the performance score has gone up, so it seems that the product has certainly improved. To check this, we set up the following **null hypothesis**.

"The average performance of new products is equal to the average of old products."

What we actually want to claim is that the average performance of the new product is really better than that of the old product. In statistical hypothesis testing, however, a hypothesis becomes meaningful when it is rejected (= denied), so we set up the opposite claim and try to reject it.

In other words, if the null hypothesis is rejected, the two averages are not equal, and since the new score is the higher one, we can say positively that the new product has been improved. Conversely, if it is not rejected, we cannot say that the averages of the new and old products differ. That does not prove that performance is unchanged; the correct statement is only that this experiment did not show an improvement.

On the other hand, the hypothesis that "the average performance of the new product is really better than that of the old product", as in the example above, is called the **alternative hypothesis**.
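Written as formulas, the two hypotheses could be stated as below. The symbols \mu_{\text{old}} and \mu_{\text{new}} are introduced here only for illustration and stand for the true average scores of the old and new products.

```latex
\begin{aligned}
H_0 &: \mu_{\text{new}} = \mu_{\text{old}} \\
H_1 &: \mu_{\text{new}} > \mu_{\text{old}}
\end{aligned}
```

This is the one-sided form that matches the wording "really better". A two-sided test would instead use \mu_{\text{new}} \neq \mu_{\text{old}} as the alternative.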

Test the hypothesis

We have done this before, but once again we use the SciPy t-test.

The scores for each product received from the R & D team were as follows.

#Old product group
[ 1225.95543492  1313.6427203   1255.29559405  1245.89449916  1366.75762258
  1327.53242061  1317.92790831  1324.61493269  1265.29687633  1328.31664814
  1261.87166693  1267.1872685   1308.34491084  1298.87127779  1297.86204665
  1245.68834845  1277.92232162  1318.1037024   1317.6412105   1321.97106981
  1376.45531456  1300.69798728  1293.57249855  1252.72982576  1307.78459733
  1308.73137839  1305.15108854  1281.34013092  1299.69826184  1347.69776592
  1252.48079949  1285.19555021  1271.30831279  1264.09883356  1309.92019558
  1275.0874674   1365.35342566  1263.27713759  1303.39574014  1294.24464261
  1293.56856821  1336.95824401  1291.61986512  1275.92673335  1331.23147617
  1266.5493744   1350.91634825  1298.22788355  1339.36570452  1355.4465444 ]

#New product group
[ 1354.13405911  1323.75265515  1277.60453412  1327.83291747  1349.05822437
  1272.68414964  1307.47711383  1379.03552722  1258.5028792   1328.53923338
  1363.80040966  1273.70734254  1326.38009765  1323.89588985  1327.32084927
  1311.6073846   1324.9257883   1285.28367883  1281.79079995  1336.87973377
  1327.11775168  1275.35676837  1266.37666597  1290.45032715  1312.39184943
  1296.47809079  1342.23383962  1310.94699159  1303.78171421  1296.65505569
  1342.84984941  1296.4890814   1357.35004255  1276.81169935  1283.04973271
  1292.6973255   1310.64071015  1310.07473863  1315.06180632  1268.3989793
  1294.0418435   1355.21947184  1293.42257727  1257.01667603  1286.30458648
  1286.74731659  1303.56261411  1336.33192992  1290.53467814  1328.87278939]
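The listings above are printed arrays rather than Python literals, so they need to be turned back into numbers before they can be tested. The following is a minimal sketch of one way to do that using only the standard library. The helper name `parse_scores` and the variable `old_demo` are hypothetical, and only the first row of the old product group is pasted in to keep the example short.

```python
import statistics

# Printed array text copied from the listing; only the first row of the
# old product group is used here to keep the sketch short.
old_text = """
[ 1225.95543492  1313.6427203   1255.29559405  1245.89449916  1366.75762258 ]
"""

def parse_scores(text):
    """Turn a printed array (brackets, whitespace-separated) into floats."""
    cleaned = text.replace("[", " ").replace("]", " ")
    return [float(token) for token in cleaned.split()]

old_demo = parse_scores(old_text)
print(len(old_demo))                 # number of parsed scores
print(statistics.mean(old_demo))     # sample mean
print(statistics.stdev(old_demo))    # sample standard deviation (n - 1)
```

Pasting the full 50-value listing of each group into such a string would give two lists that can be passed to SciPy's test functions, which accept plain Python lists as well as NumPy arrays.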

code

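from scipy import stats

# old, new: the 50 scores of each product group listed above.
# Note: ttest_rel is the paired (related-samples) t-test; for two
# independent groups, stats.ttest_ind is the usual choice.
t, p = stats.ttest_rel(old, new)
print("t value is %s" % t)
print("The probability is %s" % p)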

if p < 0.05:
    print("There is a significant difference")
else:
    print("There is no significant difference")

Running it prints a t value of -1.503290038513141 and a probability (p-value) of 0.139182542398, followed by "There is no significant difference". Since p is greater than 0.05, the null hypothesis is not rejected.
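The scenario describes two separate groups of units and a directional alternative ("better"), whereas `stats.ttest_rel` is a paired, two-sided test. As a point of comparison, here is a hedged sketch of an independent-samples (Welch) t-test with a one-sided alternative. The function name `compare_groups` and the demo scores are made up for illustration and are not the article's data, and the `alternative` argument assumes SciPy 1.6 or later.

```python
from scipy import stats

def compare_groups(old_scores, new_scores, alpha=0.05):
    """Welch's t-test of H0: equal means against H1: mean(old) < mean(new)."""
    result = stats.ttest_ind(
        old_scores,
        new_scores,
        equal_var=False,      # Welch: no equal-variance assumption
        alternative="less",   # one-sided: old mean is less than new mean
    )
    return result.statistic, result.pvalue, result.pvalue < alpha

# Made-up demo scores (NOT the article's data), only to show the call.
old_demo = [1290.0, 1305.0, 1278.0, 1312.0, 1296.0, 1301.0]
new_demo = [1310.0, 1322.0, 1299.0, 1330.0, 1308.0, 1317.0]

t_stat, p_one_sided, significant = compare_groups(old_demo, new_demo)
print(t_stat, p_one_sided, significant)
```

Passing the two full 50-score groups instead of the demo lists would test the scenario's alternative hypothesis directly. The result can differ from the paired test, so the choice of test should follow how the samples were actually collected.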

Summary

I reviewed the null hypothesis and the alternative hypothesis and tested whether the product's quality had improved. The null hypothesis is easy to get confused about, so make sure you understand it correctly.

References

Introduction to Statistics http://ruby.kyoto-wu.ac.jp/~konami/Text/

Scipy: High-level scientific and technological calculations http://turbare.net/transl/scipy-lecture-notes/intro/scipy.html

Statistical functions (scipy.stats) http://docs.scipy.org/doc/scipy/reference/stats.html
