Click here until yesterday
You will become an engineer in 100 days-Day 59-Programming-Algorithms
You will become an engineer in 100 days --- Day 53 --Git --About Git
You will become an engineer in 100 days --Day 42 --Cloud --About cloud services
You will become an engineer in 100 days --Day 36 --Database --About the database
You will be an engineer in 100 days --Day 24 --Python --Basics of Python language 1
You will become an engineer in 100 days --Day 18 --Javascript --JavaScript basics 1
You will become an engineer in 100 days --Day 14 --CSS --CSS Basics 1
You will become an engineer in 100 days --Day 6 --HTML --HTML basics 1
Data structures are important for learning algorithms. Since the algorithm is a procedure, I said what kind of data and how to manipulate it. The structure of the data is also an important item in the algorithm.
Python list type is an array type data type using an index You can access them in order.
List type can store various data as elements You can add, change, replace, and delete elements.
#List definition
l = [1,2,3,4,5]
print(l)
#Element reference
print(l[0])
#Change elements
l[1] = 8
print(l)
#Add element
l.append(6)
print(l)
#Delete element
l.pop(3)
print(l)
#Insert element
l.insert(4,9)
print(l)
#Swap elements
l[0], l[3] = l[3], l[0]
print(l)
[1, 2, 3, 4, 5] 1 [1, 8, 3, 4, 5] [1, 8, 3, 4, 5, 6] [1, 8, 3, 5, 6] [1, 8, 3, 5, 9, 6] [5, 8, 3, 1, 9, 6]
Stack
and queue
are also one of the ways to handle data.
stack
-Data can be added from the beginning of the column, and data can be retrieved from the beginning of the column.
・ Last-in-first-out LIFO (Last-In-First-Out)
-Putting data on the stack is called push
, and taking it out is called pop
.
queue
-Add data from the end of the column, retrieve data from the beginning of the column
・ First-in first-out (Tokoroten method) FIFO (First-In-First-Out)
-Putting data in a queue is called enqueue
, and retrieving it is called dequeue
.
Reference: wikipedia
In Python, it can be realized by list type or ç.
stack
stack = []
#push
stack.append(1)
stack.append(2)
stack.append(3)
print(stack)
#pop
stack.pop()
print(stack)
stack.pop()
print(stack)
[1, 2, 3] [1, 2] [1]
queue
from collections import deque
#Queue definition
queue = deque(["A","B","C"])
print(queue)
#Enqueue
queue.append("D")
print(queue)
#Decue
queue.popleft()
print(queue)
deque(['A', 'B', 'C']) deque(['A', 'B', 'C', 'D']) deque(['A', 'B', 'C'])
What is a heap? A child element is always larger or equal (or always smaller or equal) than a parent element. It is a tree-structured data with the constraint.
When we simply say heap
, we often refer to the binary heap
using a binary tree.
Reference: wikipedia
The heap can be implemented in Python with the heapq library. You can quickly retrieve the minimum (or maximum) value in the heap.
import heapq
#Create a list
heap = [1, 6, 8, 0, -1]
print(heap)
#Heap an existing list(minimum value)
heapq.heapify(heap)
print(heap)
#Add element
heapq.heappush(heap, 10)
print(heap)
#Remove minimum value
print(heapq.heappop(heap))
print(heap)
max_heap = [3,7,5,9,-2]
print(max_heap)
#Create heap with maximum value
heapq._heapify_max(max_heap)
print(max_heap)
#Remove maximum value
print(heapq._heappop_max(max_heap))
print(max_heap)
[1, 6, 8, 0, -1] [-1, 0, 8, 1, 6] [-1, 0, 8, 1, 6, 10] -1 [0, 1, 8, 10, 6] [3, 7, 5, 9, -2] [9, 7, 5, 3, -2] 9 [7, 3, 5, -2]
The sort algorithm
is an algorithm that sorts.
There are many variations to simply sort.
Stable stable
sorting is when there are multiple records with the same key in the data
After sorting, the order of the records before sorting is maintained.
Sorting so that the positional relationship before sorting is broken by sorting It is called an unstable ʻunstable` sort.
It is desirable to adopt a sorting algorithm that finishes sorting quickly and is stable.
In the sorting algorithm, the amount of calculation is also a guide. The amount of calculation of the sort algorithm that is generally said is as follows.
algorithm | Stability | Average time | Worst time | Best time | Worst space |
---|---|---|---|---|---|
Bubble sort | Stable | ||||
Selection sort | unstable | ||||
Insertion sort | Stable | ||||
Heapsort | unstable | ||||
Merge sort | Stable | ||||
Quick sort | unstable | ||||
python sort(Tim sort) | Stable |
Now let's look at various sorting algorithms. I have posted an implementation example, but I think there is a better way to write it.
Bubble sort
compares the values of two adjacent elements in a list
It is an alignment algorithm that performs exchange according to conditions.
Sort the list in descending order (descending order) or ascending order (ascending order).
#Bubble sort
def bubble_sort(data):
for i in range(len(data)):
for j in range(len(data) - i - 1):
if data[j] > data[j + 1]:
data[j], data[j + 1] = data[j + 1], data[j]
return data
Selection sort
looks for the element with the minimum (maximum) value of the array
It is an algorithm that performs alignment by exchanging it with the first element of the array.
#Selection sort
def select_sort(data):
for i in range(len(data)):
min = i
for j in range(i + 1, len(data)):
if data[min] > data[j]:
min = j
data[i], data[min] = data[min], data[i]
return data
Insert sort
is by inserting a new element in the appropriate position for the aligned part of the list.
An algorithm for alignment.
#Insertion sort
def insert_sort(data):
for i in range(1, len(data)):
temp = data[i]
j = i - 1
while (j >= 0) and (data[j] > temp):
data[j + 1] = data[j]
j -= 1
data[j + 1] = temp
return data
Heapsort
is a sorting algorithm that sorts the list using the binary heap tree
.
Take elements from the unsorted list and add them to the heap in order. Repeat the following until you add all the elements (take the route and add it to the sorted list)
#Heapsort
def heap_sort(data):
for i in range(len(data)):
j = i
while (j > 0) and (data[(j - 1) // 2] < data[j]):
data[(j - 1) // 2], data[j] = data[j], data[(j - 1) // 2]
j = (j - 1) // 2
for i in range(len(data), 0, -1):
data[i - 1], data[0] = data[0], data[i - 1]
j = 0
while ((2 * j + 1 < i - 1) and (data[j] < data[2 * j + 1]))\
or ((2 * j + 2 < i - 1) and (data[j] < data[2 * j + 2])):
if (2 * j + 2 == i - 1) or (data[2 * j + 1] > data[2 * j + 2]):
data[j], data[2 * j + 1] = data[2 * j + 1], data[j]
j = 2 * j + 1
else:
data[j], data[2 * j + 2] = data[2 * j + 2], data[j]
j = 2 * j + 2
return data
In python, using the library heapq that realizes heap queue You can also easily write a heapsort.
import heapq
#Heapsort using heapq
def heap_sort2(data):
h = data.copy()
heapq.heapify(h)
return [heapq.heappop(h) for _ in range(len(h))]
Merge sort
recursively divides the array you want to sort and merges again`
It is a sorting algorithm that tries to realize sorting.
#Merge sort
def merge_sort(data):
if len(data) <= 1:
return data
m = len(data) // 2
l = merge_sort(data[:m])
r = merge_sort(data[m:])
return merge(l , r)
def merge(left, right):
res , i , j = [], 0, 0
while (i < len(left)) and (j < len(right)):
if left[i] <= right[j]:
res.append(left[i])
i += 1
else:
res.append(right[j])
j += 1
if i < len(left):
res.extend(left[i:])
if j < len(right):
res.extend(right[j:])
return res
The feature of quicksort
is that the number of data comparisons and exchanges is very small.
You can efficiently sort data that is scattered randomly,
It is an algorithm that is not stable.
def quick_sort(data):
def quick(list, left, right):
pl = left
pr = right
y = list[(pl + pr) // 2]
while True:
while list[pl] < y: pl += 1
while list[pr] > y: pr -= 1
if pl <= pr:
list[pl], list[pr] = list[pr], list[pl]
pl += 1
pr -= 1
if pl > pr:
break
if left < pr: quick(list, left, pr)
if pl < right: quick(list, pl, right)
return data
quick(data, 0, len(data) - 1)
return data
Python's standard sort is (probably) Timsort
.
Timsort
is a stable hybrid derived from merge sort
and insertion sort
The sorting algorithm is designed to work well with different types of data.
def python_sort(data):
data.sort()
return data
Let's measure the execution time of the above sorting algorithm. You can easily measure with the following code.
Sorts 10000 numbers.
With jupyter notebook
, you can measure the time by using% time
.
import numpy as np
import sys
sys.setrecursionlimit(20000)
data = np.arange(10000)
np.random.shuffle(data)
data = list(data)
print('bubble_sort')
%time l = bubble_sort(data)
print('select_sort')
%time l = select_sort(data)
print('insert_sort')
%time l = insert_sort(data)
print('heap_sort')
%time l = heap_sort(data)
print('heap_sort2')
%time l = heap_sort2(data)
print('merge_sort')
%time l = merge_sort(data)
print('quick_sort')
%time l = quick_sort(data)
print('python_sort')
%time l = python_sort(data)
bubble_sort CPU times: user 18 µs, sys: 0 ns, total: 18 µs Wall time: 20 µs select_sort CPU times: user 18 µs, sys: 1 µs, total: 19 µs Wall time: 21 µs insert_sort CPU times: user 7 µs, sys: 0 ns, total: 7 µs Wall time: 10 µs heap_sort CPU times: user 38 µs, sys: 1 µs, total: 39 µs Wall time: 40.1 µs heap_sort2 CPU times: user 11 µs, sys: 0 ns, total: 11 µs Wall time: 12.9 µs merge_sort CPU times: user 31 µs, sys: 1e+03 ns, total: 32 µs Wall time: 33.1 µs quick_sort CPU times: user 13 µs, sys: 1 µs, total: 14 µs Wall time: 14.8 µs python_sort CPU times: user 4 µs, sys: 0 ns, total: 4 µs Wall time: 6.2 µs
Well, the shortest time is Python's standard sort.
I think that this will change considerably depending on how you implement it. I think it's quite difficult to achieve speed that exceeds standard sorting. Normally, you don't need to write your own program for sorting.
Python sorting is so powerful that you may not usually care about it. There are few problems with sorting.
How is sorting done? If you compare how each method is different You will understand the depth of the algorithm.
40 days until you become an engineer
Otsu py's HP: http://www.otupy.net/
Youtube: https://www.youtube.com/channel/UCaT7xpeq8n1G_HcJKKSOXMw
Twitter: https://twitter.com/otupython
Recommended Posts