[PYTHON] Finding which bin a value falls into when the bin edges are in an ndarray

Summary

When the bin edges are stored in an ndarray bin_edges, sorted in ascending order, the index of the bin that contains a value x can be found as follows:

get_bin_idx.py


bin_idx = np.where((bin_edges[:-1] <= x) & (x < bin_edges[1:]))[0][0]

This gives the index i that satisfies bin_edges[i] <= x < bin_edges[i + 1].
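As a small worked example of the expression above, here is a minimal sketch. The edge values and the helper name get_bin_idx are made up for illustration and are not part of the original snippet. It also shows what seems to happen when x lies outside every bin: the mask is all False, so the final [0] indexes an empty array.

```python
import numpy as np

# Hypothetical edges, chosen only for illustration.
bin_edges = np.array([0.0, 0.2, 0.5, 1.0])


def get_bin_idx(x, bin_edges):
    """Return i such that bin_edges[i] <= x < bin_edges[i + 1]."""
    inside = (bin_edges[:-1] <= x) & (x < bin_edges[1:])
    return int(np.where(inside)[0][0])


print(get_bin_idx(0.3, bin_edges))  # 1
print(get_bin_idx(0.5, bin_edges))  # 2 (the left edge of a bin is inclusive)

# Outside [bin_edges[0], bin_edges[-1]) no bin matches, so the
# final [0] is applied to an empty array and raises IndexError.
try:
    get_bin_idx(1.0, bin_edges)
except IndexError:
    print("x is outside every bin")
```

If out-of-range values are possible, it is probably worth checking for them before indexing.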

Postscript (2016/03/16):

get_bin_idx2.py


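# side='right' so that an x equal to an edge goes into the bin to its right, as in get_bin_idx.py
bin_idx = np.searchsorted(bin_edges, x, side='right') - 1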

The same result can also be obtained this way (thanks to termoshtt).
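One detail to watch with np.searchsorted() is its side argument, which only matters when x coincides exactly with an edge. The following sketch, using made-up edge values, compares the two settings for such an x. With side='left' (the default) the value lands in the bin to its left, while side='right' matches the bin_edges[i] <= x < bin_edges[i + 1] convention.

```python
import numpy as np

bin_edges = np.array([0.0, 0.2, 0.5, 1.0])  # hypothetical edges
x = 0.5  # exactly on an edge

left = np.searchsorted(bin_edges, x, side='left') - 1
right = np.searchsorted(bin_edges, x, side='right') - 1

print(left)   # 1: x is treated as belonging to the bin ending at 0.5
print(right)  # 2: x is treated as belonging to the bin starting at 0.5
```

For values that never fall exactly on an edge, such as typical random floats, both settings return the same index.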

Postscript (2017/01/02):

get_bin_idx3.py


bin_idx = np.digitize(x, bin_edges) - 1

Furthermore, the same result can be obtained this way. In fact, according to the NumPy documentation, np.digitize() appears to be implemented in terms of np.searchsorted(). The difference between the two seems to be only one of intent: np.digitize(x, bin_edges) is meant for "finding the index of the bin that each element of x falls into, for the bins defined by bin_edges", whereas np.searchsorted(y, x) is meant for "finding the index at which each element of x would be inserted into the sorted array y".
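To check that the approaches agree, here is a rough sketch that applies them to many values at once. The seed, the sample size, and the variable names are arbitrary choices made for this illustration. It assumes the default right=False for np.digitize(), which corresponds to side='right' in np.searchsorted() for increasing edges.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily
bin_edges = np.sort(np.hstack(([0.0, 1.0], rng.random(9))))
xs = rng.random(1000)  # values in [0, 1), so every x falls in some bin

# Both functions accept an array of values, so no Python loop is needed.
idx_digitize = np.digitize(xs, bin_edges) - 1
idx_searchsorted = np.searchsorted(bin_edges, xs, side='right') - 1

# The np.where() expression handles one scalar at a time.
idx_where = np.array([
    np.where((bin_edges[:-1] <= x) & (x < bin_edges[1:]))[0][0]
    for x in xs
])

print(np.array_equal(idx_digitize, idx_searchsorted))  # True
print(np.array_equal(idx_digitize, idx_where))         # True
```

Besides reading more clearly, the np.digitize() and np.searchsorted() versions work on whole arrays of x in a single call.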

Example

Create some bins from random numbers:

gen_bins.py


import numpy as np
# 9 random interior edges plus the end points 0 and 1, giving 10 bins
bin_edges = np.hstack(([0., 1.], np.random.rand(9)))
bin_edges.sort()

Plot the relationship between a value 0 < x < 1 and its bin index:

plot_bins.py


import matplotlib.pyplot as plt
plt.step(bin_edges, np.arange(len(bin_edges)), where='post')
plt.ylim(-0.5, 9.5)
plt.xlabel('x')
plt.ylabel('Bin index')

bins.png

Find which bin a random x falls into:

get_and_plot_bin_idx.py


x = np.random.rand()
bin_idx = np.where((bin_edges[:-1] <= x) & (x < bin_edges[1:]))[0][0]
plt.axvline(x, c='r')
plt.axhline(bin_idx, c='r')

bins2.png

Once more with a different x:

get_and_plot_bin_idx_again.py


x = np.random.rand()
bin_idx = np.where((bin_edges[:-1] <= x) & (x < bin_edges[1:]))[0][0]
plt.axvline(x, c='g')
plt.axhline(bin_idx, c='g')

bins3.png

That's all.
