[PYTHON] Cosine similarity matrix? You can get it right away with NumPy

Conclusion

def cos_sim_matrix(matrix):
    """
    Given an item-feature matrix (a NumPy array with one row per item),
    return the cosine similarity matrix between items.
    """
    # Item-by-item matrix whose elements are the inner products of the item vectors
    d = matrix @ matrix.T

    # L2 norm of each item vector, which goes in the denominator of the cosine similarity
    norm = (matrix * matrix).sum(axis=1, keepdims=True) ** .5

    # Divide each element by the norms of both items via broadcasting (somewhat smart!)
    return d / norm / norm.T
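
As a quick sanity check, here is a minimal sketch that runs the function on a tiny, made-up item-feature matrix and compares the result with the textbook two-vector formula computed pair by pair. The sample data and the helper name `pairwise_reference` are assumptions for illustration only.

```python
import numpy as np


def cos_sim_matrix(matrix):
    # Same function as above, repeated so this snippet runs on its own
    d = matrix @ matrix.T
    norm = (matrix * matrix).sum(axis=1, keepdims=True) ** .5
    return d / norm / norm.T


def pairwise_reference(matrix):
    # Hypothetical helper: apply the two-vector formula to every pair of rows
    n = matrix.shape[0]
    out = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            a, b = matrix[i], matrix[j]
            out[i, j] = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return out


# Made-up data: 3 items described by 2 features each
items = np.array([[1.0, 0.0],
                  [0.0, 2.0],
                  [3.0, 3.0]])

sims = cos_sim_matrix(items)
reference = pairwise_reference(items)

print(sims)
print(np.allclose(sims, reference))
```

Note that an item whose features are all zero has norm 0, so its row and column come out as NaN; such rows probably need to be filtered out or handled separately beforehand.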

Introduction

Do you like cosine similarity matrices? I love them.

They come up all the time, especially when implementing collaborative filtering.

Unfortunately, when I search in Japanese, I mostly find how to compute the cosine similarity between two vectors, and rarely how to compute the cosine similarity matrix for a whole collection of vectors.

Or perhaps it is so obvious to anyone with a bit more NumPy skill that nobody bothers to write it down.

So I am leaving it here as a note to myself.

The mistake I made before getting there

Please do not use this as a reference; it is a failed attempt. It takes dozens of times longer than the version above.

import numba
import numpy as np


@numba.jit('f8(f8[:],f8[:])', nopython=True)
def _cos_sim(v1, v2):
    """
    Return the cosine similarity of two vectors.
    """
    return np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))


@numba.jit('f8[:, :](f8[:,:])', nopython=True)
def item_similarities(item_user_matrix):
    """
    Given an item-user matrix (one row per item),
    return the similarity matrix between items.
    """
    n = item_user_matrix.shape[0]  # n: number of items
    sims = np.identity(n)  # similarity of an item with itself is 1

    for i in range(n):
        for j in range(i + 1, n):
            sim = _cos_sim(item_user_matrix[i], item_user_matrix[j])
            sims[i, j] = sim
            sims[j, i] = sim
    return sims
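
If you want to check the difference on your own machine, a rough sketch like the following may help. It assumes a plain-Python version of the double loop (without the numba decorators, so the timings are not the ones I measured) and random made-up data; `item_similarities_loop` is a name introduced only for this example. It confirms that both approaches give the same matrix and prints how long each one takes.

```python
import timeit

import numpy as np


def cos_sim_matrix(matrix):
    # Vectorized version from the Conclusion section
    d = matrix @ matrix.T
    norm = (matrix * matrix).sum(axis=1, keepdims=True) ** .5
    return d / norm / norm.T


def item_similarities_loop(item_user_matrix):
    # Same double loop as above, but in plain Python (no numba), for illustration
    n = item_user_matrix.shape[0]
    sims = np.identity(n)
    for i in range(n):
        for j in range(i + 1, n):
            v1, v2 = item_user_matrix[i], item_user_matrix[j]
            sim = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
            sims[i, j] = sim
            sims[j, i] = sim
    return sims


# Made-up data: 100 items x 50 users, shifted so that no row is all zeros
rng = np.random.default_rng(0)
data = rng.random((100, 50)) + 0.1

same = np.allclose(cos_sim_matrix(data), item_similarities_loop(data))
t_vec = timeit.timeit(lambda: cos_sim_matrix(data), number=5)
t_loop = timeit.timeit(lambda: item_similarities_loop(data), number=5)

print("same result:", same)
print("vectorized: %.4f s, loop: %.4f s" % (t_vec, t_loop))
```

The exact numbers depend on the machine and the matrix size, so treat them only as a rough guide.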

Finally

I need more power...!!
