[PYTHON] What is bucket sort? Merideme and code example

What is bucket sort?

A sorting method that counts the number of each element, specifies the array number to insert that element, and inserts them in order.

There are two arrays to prepare.

** (1) Counter array ** Count how many each element is.

** (2) Array to put the sort result ** Prepare an array with the same number of elements as the array you want to sort, and insert the number of elements specified in (1).


** ▼ Bucket sort image ** For an array of [1,3,2,1,2,1]

Prepare 1,2,3 containers (buckets), put each element in it, and count the number. (Example: 1 is 3, 2 is 2, 3 is 1)

Arrange the elements in order.

image.png

[Reference: wikipedia](https://ja.wikipedia.org/wiki/%E3%83%90%E3%82%B1%E3%83%83%E3%83%88%E3%82%BD% E3% 83% BC% E3% 83% 88)

Bucket sort merideme

·merit

Intuitive and easy to understand. The fastest sorting algorithm on a computational complexity basis.

Efficient sorting when the maximum value is small and the same number is duplicated many times.

·Demerit

If the maximum value of the original array is large, you must prepare a bucket for that maximum value.

It cannot be used under the following conditions. ・ Non-integer (including decimal point) ┗Because the bucket cannot be prepared

・ Maximum value is too large ┗ Because the memory usage of the counter array becomes huge


** ▼ Examples of disadvantages ** (explicitly understand that a useless bucket will occur)

When sorting list = [1,2,1,10] There are no 3-9, but you have to prepare a bucket for that.

bucket: 1 2 3 4 5 6 7 8 9 10
Number of elements: 2 1 0 0 0 0 0 0 0 1

The smaller the maximum value, the faster the sort, but ** the larger the maximum value, the more buckets to prepare **.


## Bucket sort code example

python


def bucket_sort(arr):
  arrc = [0]*(max(arr)+1)

  #Creating a counter array. 0 start. 0th is always 0
  for i in arr:
    arrc[i] += 1

  #Prepare an array to put the sort result
  ans=[]

  for j in range(1, len(arrc)):
    ans.extend([j]*arrc[j])
  
  return ans 

Use extend to add arrays.

Verification


list=[8,3,1,2,3,3,10,7,6,7]
bucket_sort(list)

#[1, 2, 3, 3, 3, 6, 7, 7, 8, 10]

### How to not use extend

python


def bucket_sort2(arr):
  arrc = [0]*(max(arr)+1)

  #Creating a counter array. 0 start. 0th is always 0
  for i in arr:
    arrc[i] += 1

  #Prepare an array to put the sort result
  ans=[]

  for j in range(1, len(arrc)):
    ans = ans+[j]*arrc[j]
  
  return ans

Since each bucket has the same depth, simply add them.

Verification


list=[8,3,1,2,3,3,10,7,6,7]
bucket_sort2(list)

#[1, 2, 3, 3, 3, 6, 7, 7, 8, 10]

### How to determine the insertion position from the cumulative sum Converts the counter array to a cumulative sum and finds a place to insert the specified element.

If you insert from the right, consider the case where the element is duplicated, and set the place where the element is inserted by -1.

python


def bucket_sort3(arr):
  #Create an array with as many elements as the maximum number in the array.
  #When calculating the cumulative sum-Count from 0 to refer to the element of 1.
  arrc = [0] *(max(arr)+1)

  #Count the number of elements in the array
  #Add 1 to the specified sequence number
  for i in arr:
    arrc[i] += 1

  #Convert the count to a cumulative sum.
  #Add the previous element to the current element and replace it
  for j in range(1, len(arrc)) :
    arrc[j] = arrc[j] + arrc[j-1]
  
  #Create an array to store the sort results. The number of elements in the array is the same as the original array
  arrs = [0]*len(arr)
  
  #Take out the elements of the original array one by one, find out what number they are in, and insert them.
  for i in arr:

    #Find out what number i to insert. Since 0 is not required for the sorted array,-1 Insert in place
    #If there are multiple same numbers, fill them from the rightmost side to the left side.
    arrs[arrc[i]-1] =i
   
    #Decrease the number of elements of your i by one from the cumulative sum considering the case where there are multiple same elements
    arrc[i] -= 1

  return arrs

Verification


list=[8,3,1,2,3,3,10,7,6,7]
bucket_sort3(list)

#[1, 2, 3, 3, 3, 6, 7, 7, 8, 10]

Since the teaching material introduced how to use cumulative sum, I tried to unravel the process, but honestly, it is difficult to understand intuitively.

Recommended Posts

What is bucket sort? Merideme and code example
[Python] Python and security-① What is Python?
[Python] What is pandas Series and DataFrame?
What is the difference between `pip` and` conda`?
What are you comparing with Python is and ==?
What is equal repayment of principal and interest and equal repayment of principal and interest?
What is the difference between Unix and Linux?
What is namespace
What is copy.copy ()
What is Django? .. ..
What is POSIX?
What is Linux
What is klass?
What is SALOME?
What is Linux?
What is python
What is hyperopt?
What is Linux
What is pyvenv
What is __call__
What is Linux
What is Python
What is the difference between usleep, nanosleep and clock_nanosleep?
What is the true identity of Python's sort method "sort"? ??
What is pip and how do you use it?
Check what the character code is for all files under the directory that is Python and output