[PYTHON] Make a distance matrix

Distance between points

In Numpy, using loops is slow, so various techniques are used. It's easy to write in vector or matrix format, but there are some that use broadcast. An example is the generation of a distance matrix. Given that the locations of multiple locations are stored in x, how can we calculate the distance between them?

Loop for the time being

I've been familiar with Fortran for many years, so I want to write using loops.

do j = 1, n
  do i = 1, n
    r(i, j) = sqrt((x(i) - x(j))**2)
  end do
end do

Translated literally into Python

for i in range(len(x)):
    for j in range(len(x)):
        r[i, j] = math.sqrt((x[i] - x[j])**2)

It will be. Even with ndarray, there is no difference if the length is about 10, but rather it is faster to keep the list.

broadcast

Numpy's array operations (including products) are basically element-by-element. Operations on arrays of the same shape, such as 4x3 two-dimensional arrays, are performed on each element. For items with different sizes, such as a 4x3 2D array and a 1x3 1D array, the calculation is performed assuming that the rows of the 1D array are repeated so that the sizes are the same. This conversion is called "broadcast".

For the x that stores the position, it seems that an array will be returned if the operation with the transpose is performed, but that is not the case. It will be regarded as the same shape and will be a list of 0.

Increase dimensions

To use broadcast to generate the desired distance matrix without loops, increase one dimension. Use np.newaxis to increase the dimension. The same is true for None. The length of the dimension for which np.newaxis or None is specified is 1. Although I have never used it, Fortran 95 has a built-in function called spread.

np.sqrt((x - x[:, np.newaxis])**2)

It may be an operation of nx1 and 1xn.

np.sqrt((x[np.newaxis, :] - x[:, np.newaxis])**2)

In my environment, when measured with % timeit for 10 points, the loop of ndarray was 69.1 μs, the loop of list was 54.1 μs, and the one that avoided the loop and increased the dimension of the matrix was 3.04 μs. At 1000 points, the loop of ndarray was 660 ms, and the one with increased matrix dimensions was 2.47 ms.

n-dimensional

Using ... is the same as arranging : so that the number of dimensions is the dimension of the array. The Euclidean distance of a point in n-dimensional space can be calculated as follows.

np.sqrt(((x[..., np.newaxis, :] - x[..., :, np.newaxis])**2).sum(axis=0))

reference

Recommended Posts

Make a distance matrix
Make a squash game
Make a function decorator
I'll make a password!
Make a Nyan button
Make a Tetris-style game!
Make a Base64 decoder
Let's make a Discord Bot.
Make a Blueqat backend ~ Part 1
[Django] Make a pull-down menu
Make a LINE BOT (chat)
Make a bookmarklet in Python
Make a fortune with Python
Make Responder a daemon (service)
Let's make a rock-paper-scissors game
Make a fire with kdeplot
Make a math drill print
Let's make a remote rumba [Hardware]
How to make a Japanese-English translation
Make a Santa classifier from a Santa image
Let's make a remote rumba [Software]
Make a Tweet box for Pepper
Make a sound with Jupyter notebook
Draw a scatterplot matrix in python
Make a face recognizer using TensorFlow
How to make a slack bot
Let's make a breakout with wxPython
Let's make a spot sale service 1
How to make a crawler --Advanced
How to make a recursive function
Make C compilation a little easier
python / Make a dict from a list.
[Python] Make the function a lambda function
Make a recommender system with python
How to make a deadman's switch
[Blender] How to make a Blender plugin
Make Flask a Cloud Native application
Make a filter with a django template
Let's make a graph with python! !!
Let's make a supercomputer with xCAT
How to make a crawler --Basic
Make a model iterator with PySide
When creating a matrix in a list
Make a nice graph with plotly
Make a curtain generator in Blender
[Python] How to make a matrix of repeating patterns (repmat / tile)
Let's make a spot sale service 3
Let's make a shiritori game with Python
Make a video player with PySimpleGUI + OpenCV
Try to make a kernel of Jupyter
Make Jupyter Notebook a service on CentOS
Make a Notebook Pipeline with Kedro + Papermill
Make a partially zoomed figure with matplotlib
Make a drawing quiz with kivy + PyTorch
Let's make a voice slowly with Python
Make a cascade classifier with google colaboratory
Let's make a simple language with PLY 1
Make Qt for Python app a desktop app
Do you make something like a rocket?
Make a logic circuit with a perceptron (multilayer perceptron)
Make a Yes No Popup with Kivy