[PYTHON] Calculation of Spearman's rank correlation coefficient

What is Spearman's Rank Correlation Coefficient?

An index that measures the correlation between two sets of ranked data. For details, see the links below. Wiki: [Spearman's Rank Correlation Coefficient](https://ja.wikipedia.org/wiki/%E3%82%B9%E3%83%94%E3%82%A2%E3%83%9E%E3%83%B3%E3%81%AE%E9%A0%86%E4%BD%8D%E7%9B%B8%E9%96%A2%E4%BF%82%E6%95%B0) Toki no Mori Wiki: [Spearman Rank Correlation Coefficient](http://ibisforest.org/index.php?Spearman%E9%A0%86%E4%BD%8D%E7%9B%B8%E9%96%A2%E4%BF%82%E6%95%B0)

There are several equivalent formulas; this article uses the following one.

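$$\rho = 1 - \frac{6 \sum_{i=1}^{N} D_i^2}{N^3 - N}$$

Here $D_i$ is the difference between the two ranks of item $i$, and $N$ is the number of items.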

To check the output of your program, I recommend comparing it with the worked example on the page below. The rank correlation coefficient is well known, so a quick search turns up other samples as well. Introduction to Spearman's Rank Correlation Coefficient Statistics
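As a small self-check that does not depend on an external page, the formula used in this article, 1 - 6ΣD² / (N³ - N), can be worked through by hand. The two rankings below are hypothetical example data (five items, no ties), not taken from the reference page.

```python
# Hypothetical example data: two rankings of the same five items (no ties).
rank_a = [1, 2, 3, 4, 5]
rank_b = [2, 1, 4, 3, 5]

n = len(rank_a)

# Squared rank difference for each item: [1, 1, 1, 1, 0]
squared_diffs = [(a - b) ** 2 for a, b in zip(rank_a, rank_b)]

# rho = 1 - 6 * sum(D^2) / (N^3 - N) = 1 - 6 * 4 / 120
rho = 1 - 6 * sum(squared_diffs) / float(n ** 3 - n)

print(rho)  # 0.8
```

A program that implements the same formula should return 0.8 for these inputs.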

Calculation program

spearman.py


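def spearman(list_a, list_b):
    # list_a and list_b are equal-length lists of ranks without ties
    N = len(list_a)
    # Sum of squared rank differences, then 1 - 6 * sum / (N^3 - N)
    return 1 - (6 * sum(map(lambda a, b: (a - b) ** 2,
                            list_a, list_b))) / float(N ** 3 - N)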

The calculation is this simple. Each argument is a list of ranks such as [1, 2, 3, ...]. Alternatively, you could combine the two sequences with zip and pass them as a single list of pairs. With numpy, the map/lambda part disappears and the code becomes even simpler.
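The two calling styles described here can be sketched as follows. The function `spearman_from_pairs` and the `judge_1` / `judge_2` data are hypothetical names introduced only for illustration; the block repeats `spearman` so that it runs on its own.

```python
def spearman(list_a, list_b):
    N = len(list_a)
    return 1 - (6 * sum(map(lambda a, b: (a - b) ** 2,
                            list_a, list_b))) / float(N ** 3 - N)


def spearman_from_pairs(pairs):
    """Hypothetical variant that takes (rank_a, rank_b) pairs, e.g. from zip()."""
    pairs = list(pairs)  # zip() returns a one-shot iterator in Python 3
    n = len(pairs)
    return 1 - 6 * sum((a - b) ** 2 for a, b in pairs) / float(n ** 3 - n)


# Hypothetical rankings of five items by two judges (no ties).
judge_1 = [1, 2, 3, 4, 5]
judge_2 = [5, 4, 3, 2, 1]

print(spearman(judge_1, judge_1))                  # identical rankings: 1.0
print(spearman(judge_1, judge_2))                  # reversed rankings: -1.0
print(spearman_from_pairs(zip(judge_1, judge_2)))  # same result from pairs: -1.0
```

Identical rankings give 1.0 and exactly reversed rankings give -1.0, which makes these two cases a convenient sanity check.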

spearman_numpy.py


import numpy


def spearman(array_a, array_b):
    # array_a and array_b are equal-length numpy arrays of ranks without ties
    N = len(array_a)
    return 1 - (6 * numpy.sum((array_a - array_b) ** 2)) / float(N ** 3 - N)

The code originally contained a mistake, which I fixed based on owdowt's comment. Thank you very much. [Correction date: 19/02/26]

This version is simpler and better. Each argument is an array of ranks such as numpy.array([1, 2, 3, ...]). When working with numeric sequences in Python, numpy is usually the better choice.

Exception handling

This code does no exception handling: it does not check that both sequences have the same length or that they contain at least two items. Also, if the data contain ties (items that share the same rank), a different formula is required, so refer to the URLs introduced at the beginning.
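One possible way to cover both points is sketched below. It is not the formula used in this article: it assumes the common convention of giving tied values the average of their positions and then taking the Pearson correlation of those ranks, which agrees with the shortcut formula when there are no ties. The names `average_ranks` and `spearman_checked` are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of equal values that starts at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_checked(values_a, values_b):
    """Hypothetical tie-aware variant: Pearson correlation of average ranks."""
    if len(values_a) != len(values_b):
        raise ValueError("both sequences must have the same length")
    n = len(values_a)
    if n < 2:
        raise ValueError("at least two observations are required")
    ranks_a, ranks_b = average_ranks(values_a), average_ranks(values_b)
    mean_a, mean_b = sum(ranks_a) / n, sum(ranks_b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ranks_a, ranks_b))
    var_a = sum((x - mean_a) ** 2 for x in ranks_a)
    var_b = sum((y - mean_b) ** 2 for y in ranks_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined when all values are tied")
    return cov / math.sqrt(var_a * var_b)


print(average_ranks([10, 20, 20, 30]))  # [1.0, 2.5, 2.5, 4.0]
```

Because the function ranks its inputs itself, it accepts raw scores as well as ranks. For untied rankings such as [1, 2, 3, 4, 5] and [2, 1, 4, 3, 5] it should give the same value as the shortcut formula.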
