[PYTHON] Calculation of Cramér's coefficient of association (Cramér's V)

I couldn't find a Python module that calculates Cramér's coefficient of association (Cramér's V), so I wrote one myself. For the calculation method, I followed the explanation here.
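For orientation, the usual textbook definition that the script appears to follow can be written as below. The symbols are notation introduced for this note, not taken from the referenced page: O_ij are the observed counts in an r x c contingency table, E_ij the counts expected under independence, and n the total number of observations. No continuity correction is assumed.

```latex
E_{ij} = \frac{\left(\sum_{j'} O_{ij'}\right)\left(\sum_{i'} O_{i'j}\right)}{n}, \qquad
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \qquad
V = \sqrt{\frac{\chi^2}{n \, (\min(r, c) - 1)}}
```

V ranges from 0 (no association) to 1 (perfect association).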

cramerV.py


# -*- coding: utf-8 -*-

import numpy as np
import pandas as pd  # needed by cramersV() when this file is imported as a module


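def cramersV(x, y):
    """
    Compute Cramer's V (Cramer's coefficient of association).

    Parameters
    ----------
    x : {numpy.ndarray, pandas.Series}
        Categorical values of the first variable.
    y : {numpy.ndarray, pandas.Series}
        Categorical values of the second variable, same length as x.

    Returns
    -------
    float
        Cramer's V, between 0 (no association) and 1 (perfect association).
    """
    # Contingency table of observed counts
    table = np.array(pd.crosstab(x, y)).astype(np.float64)
    n = table.sum()
    colsum = table.sum(axis=0)
    rowsum = table.sum(axis=1)
    # Expected counts under independence
    expect = np.outer(rowsum, colsum) / n
    # Pearson's chi-squared statistic (no continuity correction)
    chisq = np.sum((table - expect) ** 2 / expect)
    return np.sqrt(chisq / (n * (np.min(table.shape) - 1)))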


if __name__ == "__main__":
    import pandas as pd
    data = pd.DataFrame(
        {'science': ['like', 'like', 'like', 'like', 'like', 'like', 'like',
                     'like', 'like', 'like', 'like', 'like', 'like', 'like',
                     'like', 'like', 'like', 'like', 'like', 'like', 'like',
                     'like', 'like', 'like', 'like', 'like', 'like', 'like',
                     'like', 'like', 'like', 'dislike', 'dislike', 'dislike',
                     'dislike', 'dislike', 'dislike', 'dislike', 'dislike',
                     'dislike', 'dislike', 'dislike', 'dislike', 'dislike',
                     'dislike', 'dislike', 'dislike', 'dislike', 'dislike',
                     'dislike'],
         'math': ['like', 'like', 'like', 'like', 'like', 'like', 'like',
                  'like', 'like', 'like', 'like', 'like', 'like', 'like',
                  'like', 'like', 'like', 'like', 'like', 'like', 'like',
                  'like', 'like', 'like', 'dislike', 'dislike', 'dislike',
                  'dislike', 'dislike', 'dislike', 'dislike', 'like', 'like',
                  'like', 'like', 'like', 'like', 'dislike', 'dislike',
                  'dislike', 'dislike', 'dislike', 'dislike', 'dislike',
                  'dislike', 'dislike', 'dislike', 'dislike', 'dislike',
                  'dislike']})
    print(cramersV(data['science'], data['math']))  # prints roughly 0.454
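
As a sanity check, here is a minimal sketch that recomputes the value from the contingency table of the sample data. The counts were tallied by hand from the lists above, and `cramers_v_from_table` and `phi_2x2` are hypothetical helper names used only for this illustration. For a 2x2 table, Cramér's V should equal the absolute value of the phi coefficient, which gives an independent way to check the result.

```python
import numpy as np

# Contingency table tallied from the sample data above
# (rows: science, columns: math; categories ordered dislike, like).
table = np.array([[13.0, 6.0],
                  [7.0, 24.0]])


def cramers_v_from_table(table):
    """Cramer's V from a table of observed counts (hypothetical helper)."""
    n = table.sum()
    # Expected counts under independence
    expect = np.outer(table.sum(axis=1), table.sum(axis=0)) / n
    chisq = np.sum((table - expect) ** 2 / expect)
    return np.sqrt(chisq / (n * (min(table.shape) - 1)))


def phi_2x2(table):
    """Absolute phi coefficient of a 2x2 table (hypothetical helper)."""
    (a, b), (c, d) = table
    return abs(a * d - b * c) / np.sqrt((a + b) * (c + d) * (a + c) * (b + d))


v = cramers_v_from_table(table)
phi = phi_2x2(table)
print(v, phi)  # both values are approximately 0.454
```

If the two numbers agree, the chi-squared route and the closed-form 2x2 formula are consistent for this data.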
