In this article, I give Python answers to Topics #7 through #12 of the ALDS1 course (Introduction to Algorithms and Data Structures) on AOJ, following the spiral book.
This is a continuation of the previous article, Spiral book in Python, AOJ's answers (ALDS1 #1-#6). Please refer to that article for an explanation of the spiral book and AOJ.
- [Topic #7 Tree Structure](https://qiita.com/nakasan/items/9ed9c858b6473fef49e2#%E3%83%88%E3%83%94%E3%83%83%E3%82%AF-7-%E6%9C%A8%E6%A7%8B%E9%80%A0)
- [Topic #8 Binary Search Tree](https://qiita.com/nakasan/items/9ed9c858b6473fef49e2#%E3%83%88%E3%83%94%E3%83%83%E3%82%AF-8-%E4%BA%8C%E5%88%86%E6%8E%A2%E7%B4%A2%E6%9C%A8)
- [Topic #9 Heap](https://qiita.com/nakasan/items/9ed9c858b6473fef49e2#%E3%83%88%E3%83%94%E3%83%83%E3%82%AF-9-%E3%83%92%E3%83%BC%E3%83%97)
- [Topic #10 Dynamic Programming](https://qiita.com/nakasan/items/9ed9c858b6473fef49e2#%E3%83%88%E3%83%94%E3%83%83%E3%82%AF-10-%E5%8B%95%E7%9A%84%E8%A8%88%E7%94%BB%E6%B3%95)
- [Topic #11 Graph I](https://qiita.com/nakasan/items/9ed9c858b6473fef49e2#%E3%83%88%E3%83%94%E3%83%83%E3%82%AF-11-%E3%82%B0%E3%83%A9%E3%83%95i)
- [Topic #12 Graph II](https://qiita.com/nakasan/items/9ed9c858b6473fef49e2#%E3%83%88%E3%83%94%E3%83%83%E3%82%AF-12-%E3%82%B0%E3%83%A9%E3%83%95ii)
ALDS1_7_A: Rooted Trees
```python
class Node:
    def __init__(self, pa=-1, chs=None):
        self.pa = pa
        self.chs = chs

n = int(input())
tree = {id: Node() for id in range(n)}
for _ in range(n):
    id, _, *chs = map(int, input().split())
    tree[id].chs = chs
    for ch in chs:
        tree[ch].pa = id

def set_depths(id, depth):
    tree[id].depth = depth
    for ch in tree[id].chs:
        set_depths(ch, depth+1)

for id in tree:
    if tree[id].pa == -1:
        set_depths(id, 0)
        break

for id, node in tree.items():
    kind = "root" if node.pa == -1 else "internal node" if node.chs else "leaf"
    print(f"node {id}: parent = {node.pa}, depth = {node.depth}, {kind}, {node.chs}")
```
- A node is represented by the class `Node`, which holds the `id` of its parent in `pa` and the `id`s of its children in `chs`. Instead of referencing all children, a node can also be represented by just a reference to its leftmost child and a reference to its next sibling to the right, which lets any tree be treated as essentially a binary tree (the "left-child, right-sibling representation"; see the sketch after this list), but I think the code is easier to read with references to all children.
- The tree is represented as a dictionary with `id` as the key and the corresponding `Node` instance as the value.
- Since the problem asks to display the parent as -1 when there is no parent, -1 is formally treated as the id of "nothing".
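For reference, here is a minimal sketch of the left-child right-sibling representation mentioned above (the `LCRSNode` class and `children` helper are my own illustration and are not used in the solution):

```python
class LCRSNode:
    """Hypothetical left-child right-sibling node (not used above)."""
    def __init__(self, pa=-1, left_ch=-1, right_sib=-1):
        self.pa = pa                # parent id
        self.left_ch = left_ch      # id of the leftmost child only
        self.right_sib = right_sib  # id of the next sibling to the right

def children(tree, id):
    """Recover all children of `id` by following the sibling links."""
    ch = tree[id].left_ch
    while ch != -1:
        yield ch
        ch = tree[ch].right_sib
```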
ALDS1_7_B: Binary Trees
```python
class Node:
    def __init__(self, chs, pa=-1):
        self.chs = chs
        self.pa = pa

n = int(input())
tree = {id: Node(chs=[-1, -1]) for id in range(-1, n)}
for _ in range(n):
    id, *chs = map(int, input().split())
    tree[id].chs = chs
    for ch in chs:
        tree[ch].pa = id

def set_d_and_h(id, d):
    tree[id].d = d
    h = 0
    for ch in tree[id].chs:
        if ch != -1:
            h = max(h, set_d_and_h(ch, d+1) + 1)
    tree[id].h = h
    return h

for id in tree:
    if tree[id].pa == -1:
        set_d_and_h(id, 0)
        break

for id, node in tree.items():
    if id != -1:
        sib = tree[node.pa].chs[1] if tree[node.pa].chs[0] == id else tree[node.pa].chs[0]
        deg = 2 - node.chs.count(-1)
        kind = "root" if node.pa == -1 else "internal node" if deg else "leaf"
        print(f"node {id}: parent = {node.pa}, sibling = {sib}, degree = {deg}, depth = {node.d}, height = {node.h}, {kind}")
```
- Nonexistent children are given as -1 in the input, so a sentinel node whose `id` is -1 is added to the tree, which lets the loop over children run unchanged in that case. The parent/child information recorded for the node with `id` -1 is never used.
- Depth and height can each be computed inductively, and conveniently both can be computed in a single pass, as here.
ALDS1_7_C: Tree Walk
```python
class Node:
    def __init__(self, chs, pa=-1):
        self.chs = chs
        self.pa = pa

n = int(input())
tree = {id: Node(chs=[-1, -1]) for id in range(-1, n)}
for _ in range(n):
    id, *chs = map(int, input().split())
    tree[id].chs = chs
    for ch in chs:
        tree[ch].pa = id

def walk(id, order):
    walked = ""
    if id != -1:
        if order == "Pre":
            walked += f" {id}"
        walked += walk(tree[id].chs[0], order)
        if order == "In":
            walked += f" {id}"
        walked += walk(tree[id].chs[1], order)
        if order == "Post":
            walked += f" {id}"
    return walked

for id in tree:
    if tree[id].pa == -1:
        for order in ["Pre", "In", "Post"]:
            print(order + "order\n" + walk(id, order))
        break
```
- The part above `def walk` is the same as in the previous problem 7_B.
ALDS1_7_D: Reconstruction of a Tree
```python
_ = int(input())
pr_walk = [*map(int, input().split())]
in_walk = [*map(int, input().split())]
walked = []

def recon_tree(pr_walk, in_walk):
    if pr_walk:
        root = pr_walk[0]
        i = in_walk.index(root)
        recon_tree(pr_walk[1:i+1], in_walk[:i])
        recon_tree(pr_walk[i+1:], in_walk[i+1:])
        walked.append(root)

recon_tree(pr_walk, in_walk)
print(*walked)
```
- Since only the postorder walk has to be printed, it can be produced during the reconstruction of the tree; the reconstructed tree itself is never used, so there is no need to create node instances at all.
In a linked list, insertion and deletion (at a given position) take $O(1)$, but search takes $O(n)$, and deletion by key requires a search first, so it also costs $O(n)$. A binary search tree, by contrast, is a data structure that supports insertion, search, and deletion all in $O(h)$, where $h$ is the height of the tree. In the worst case $h$ is $O(n)$, but when the tree is built by inserting elements in random order, $h$ is $O(\log n)$ on average, which is very fast. An in-order traversal also visits the nodes in sorted key order in $\Theta(n)$ time.
ALDS1_8_A: Binary Search Tree I
The problem of inserting elements into a binary search tree. Omitted, since it is entirely subsumed by problem III below.
ALDS1_8_B: Binary Search Tree II
The problem of inserting and searching for elements in a binary search tree. Omitted, since it is entirely subsumed by problem III below.
ALDS1_8_C: Binary Search Tree III
The problem of inserting, searching for, and deleting elements in a binary search tree:
```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def insert(key):
    global root
    if root:
        ch = root
        while ch:
            pa, ch = ch, ch.left if key < ch.key else ch.right
        if key < pa.key:
            pa.left = Node(key)
        else:
            pa.right = Node(key)
    else:
        root = Node(key)

def find(key):
    node = root
    while node and node.key != key:
        node = node.left if key < node.key else node.right
    print("yes" if node else "no")

def delete(key):
    global root
    pa, node = None, root
    while node.key != key:
        pa, node = node, node.left if key < node.key else node.right
    if node.left and node.right:
        pa, to_del = node, node.right
        while to_del.left:
            pa, to_del = to_del, to_del.left
        node.key = to_del.key
    else:
        to_del = node
    ch = to_del.left or to_del.right
    if not pa:
        root = ch
    elif pa.left == to_del:
        pa.left = ch
    else:
        pa.right = ch

def walk(node, order):
    walked = ""
    if node:
        if order == "Pre":
            walked += f" {node.key}"
        walked += walk(node.left, order)
        if order == "In":
            walked += f" {node.key}"
        walked += walk(node.right, order)
    return walked

def show():
    for order in ["In", "Pre"]:
        print(walk(root, order))

root = None
cmds = {"print": show, "insert": insert, "find": find, "delete": delete}
for _ in range(int(input())):
    cmd_name, *key = input().split()
    cmds[cmd_name](*map(int, key))
```
- When there is no parent or child, the parent or child is formally represented as `None`.
- Insertion and search are straightforward: just descend from the root toward a leaf. Deletion is a little more complicated.
- If the node to delete has at most one child, simply make that child (`None` if there is none) the child of the deleted node's parent; the child "inherits the position", so to speak. The trouble is when the node to delete has two children. If, say, you try to promote the right child and it already has a left child of its own, there would be two left children. So another node is promoted instead. After the promotion, in-order traversal must still visit the nodes in sorted order, and this can be satisfied by extracting the node immediately before or immediately after the deleted node in in-order and substituting it; here the node immediately after is extracted (the end point reached by entering the right subtree and then following left children). Since the node immediately before has no right child and the node immediately after has no left child, each has at most one child and can be extracted by the method above.
- Each node can get by without storing a reference to its parent. Insertion and search do not need parent information at all; only deletion does, and there the parent can be tracked on the way down to the node to delete.
- Note that in Python, `A or B` returns `A` if `A` is truthy and `B` if `A` is falsy, as the snippet below illustrates.
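A quick illustration of this short-circuiting behavior:

```python
print(3 or 7)     # 3: the first operand is truthy, so it is returned
print(None or 7)  # 7: the first operand is falsy, so the second is returned
# hence `to_del.left or to_del.right` yields whichever child exists (or None)
```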
Incidentally, as pointed out at the end of Chapter 12 of [Cormen et al., "Introduction to Algorithms, 3rd Edition"](https://www.amazon.co.jp/dp/B078WPYHGN), this deletion method can cause problems. When the node to delete has two children, its instance is not actually deleted; instead, the key of another node is copied into it and that other node's instance is deleted. So if other parts of the code hold references to node instances rather than keys, they may end up referring to a different node than intended. To avoid this, the `delete` function can be rewritten as follows without copying any key information, at the cost of some complexity:
```python
def trans(pa, node, new_node):
    if not pa:
        global root
        root = new_node
    elif pa.left == node:
        pa.left = new_node
    else:
        pa.right = new_node

def delete(key):
    pa, node = None, root
    while node.key != key:
        pa, node = node, node.left if key < node.key else node.right
    if node.left and node.right:
        pa_heir, heir = node, node.right
        while heir.left:
            pa_heir, heir = heir, heir.left
        if pa_heir != node:
            trans(pa_heir, heir, heir.right)
            heir.right = node.right
        trans(pa, node, heir)
        heir.left = node.left
    else:
        trans(pa, node, node.left or node.right)
```
ALDS1_8_D: Treap — omitted for now, since it is omitted in the spiral book

A binary search tree supports insertion, search, and deletion all in $O(h)$, where $h$ is the height of the tree, and when the tree is built by inserting elements in random order, $h$ is $O(\log n)$ on average, which is very fast; for the worst insertion order, however, it becomes $O(n)$. A binary search tree engineered so that the expected (or even worst-case) tree height stays $O(\log n)$ for any insertion order is called a balanced binary search tree. This is especially useful for online algorithms, where the insertion order cannot be chosen.

- A treap is a binary search tree in which each node is also given a new variable, a randomly assigned priority, with respect to which the tree is kept a binary heap (see the sketch after this list). This effectively randomizes the insertion order, so the expected tree height is $O(\log n)$ for any insertion order; the worst case, however, is still $O(n)$.
- Self-balancing binary search trees that bound even the worst-case tree height by $O(\log n)$ include the red-black tree (two-color tree), but they are more complicated.
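Although the problem itself is skipped here, a minimal sketch of treap insertion along the lines described above (the names `TreapNode`, `rotate_right`, etc. are my own; rotations restore the max-heap condition on the random priorities):

```python
import random

class TreapNode:
    def __init__(self, key):
        self.key = key
        self.pri = random.random()  # random priority enforces heap order
        self.left = self.right = None

def rotate_right(t):
    s = t.left
    t.left, s.right = s.right, t
    return s

def rotate_left(t):
    s = t.right
    t.right, s.left = s.left, t
    return s

def treap_insert(t, key):
    """Insert key into treap rooted at t; return the new root."""
    if t is None:
        return TreapNode(key)
    if key < t.key:
        t.left = treap_insert(t.left, key)
        if t.left.pri > t.pri:   # heap condition violated: rotate up
            t = rotate_right(t)
    else:
        t.right = treap_insert(t.right, key)
        if t.right.pri > t.pri:
            t = rotate_left(t)
    return t
```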
ALDS1_9_A: Complete Binary Tree
```python
N = int(input())
heap = input().split()
for i in range(N):
    print(f"node {i+1}: key = {heap[i]}, ", end="")
    print(f"parent key = {heap[(i-1)//2]}, " if i else "", end="")
    print(f"left key = {heap[2*i+1]}, " if 2*i+1 < N else "", end="")
    print(f"right key = {heap[2*i+2]}, " if 2*i+2 < N else "")
```
- If heap indices start at 0, as in this code, then the parent, left child, and right child of index `i` are `(i-1)//2`, `2*i+1`, and `2*i+2`, respectively (see the helper sketch below). Alternatively, a dummy element such as `None` can be placed at position 0 so that indices formally start at 1, in which case the parent, left child, and right child of index `i` are `i//2`, `2*i`, and `2*i+1`.
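The 0-based index arithmetic can also be wrapped in tiny helpers, just to make the correspondence explicit (a sketch; the solution above inlines these expressions):

```python
def parent(i):
    return (i - 1) // 2  # parent of node i in a 0-based heap

def left(i):
    return 2 * i + 1     # left child

def right(i):
    return 2 * i + 2     # right child
```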
ALDS1_9_B: Maximum Heap
If you write `max_heapify` yourself:
```python
def max_heapify(A, i):
    triad = {j: A[j] if j < N else -float("inf") for j in [i, 2*i+1, 2*i+2]}
    i_max = max(triad, key=triad.get)
    if i != i_max:
        A[i], A[i_max] = A[i_max], A[i]
        max_heapify(A, i_max)

N = int(input())
heap = [*map(int, input().split())]
for i in reversed(range(N//2)):
    max_heapify(heap, i)
print("", *heap)
```
- In `max_heapify`, consider the triple formed by the given parent node `A[i]` and its children `A[2*i+1]` and `A[2*i+2]`, treating a nonexistent child as $-\infty$. If the parent is smaller than a child, swap it with the largest child and then recursively apply `max_heapify` at that child's position.
- `max_heapify` descends toward the leaves, so its complexity is on the order of the height of a complete binary tree, $O(\log N)$. When `max_heapify` is run for all nodes, there are at most $N/2^h$ subtrees of height $h$ $(1 \le h \le \log N)$, so the total complexity is $O(N \sum_{h=1}^{\log N} h/2^h) = O(N)$, using $\sum_{h=0}^{\infty} h/2^h = O(1)$.
With the `heapq` module:
```python
from heapq import heapify

_, heap = input(), [-int(x) for x in input().split()]
heapify(heap)
print("", *[-x for x in heap])
```
- Since the `heapq` module implements a min-heap, all values are negated so that the max-heap can be treated as a min-heap.
ALDS1_9_C: Priority Queue
Writing `extract` and `insert` myself as below results in TLE:
```python
def max_heapify(A, i):
    triad = {j: A[j] if j < len(A) else -float("inf") for j in [i, 2*i+1, 2*i+2]}
    i_max = max(triad, key=triad.get)
    if i != i_max:
        A[i], A[i_max] = A[i_max], A[i]
        max_heapify(A, i_max)

def extract(A):
    print(A[0])
    A[0] = A[-1]
    A.pop()
    max_heapify(A, 0)

def increase_key(A, i, key):
    A[i] = key
    while i and A[(i-1)//2] < A[i]:
        A[(i-1)//2], A[i] = A[i], A[(i-1)//2]
        i = (i-1)//2

def insert(A, key):
    A.append(None)
    increase_key(A, len(A)-1, key)

heap = []
while True:
    op, *key = input().split()
    if op == "end":
        break
    if op == "extract":
        extract(heap)
    else:
        insert(heap, int(*key))
```
Apparently it would take quite a bit of tuning to make this pass, so giving up and using the `heapq` module gets it accepted:
```python
from heapq import heappop, heappush

heap = []
while True:
    op, *key = input().split()
    if op == "end":
        break
    if op == "extract":
        print(-heappop(heap))
    else:
        heappush(heap, -int(*key))
```
Dynamic programming comes in a top-down flavor, which uses memoized recursion, and a bottom-up flavor, which explicitly solves the subproblems in increasing order of size.

- The top-down method incurs extra function-call overhead, but has the advantage that only the necessary subproblems are solved when not all of them are needed.
- The bottom-up method, on the other hand, requires thinking about the order in which to build up the solutions, but has the advantage that space can be saved when intermediate subproblem results can be discarded along the way, as in the Fibonacci problem below.
ALDS1_10_A: Fibonacci Number
In the top-down method:
```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    if n < 2:
        return 1
    return fib(n-1) + fib(n-2)

print(fib(int(input())))
```
In the bottom-up method:
```python
x = y = 1
for _ in range(int(input())):
    x, y = x + y, x
print(y)
```
ALDS1_10_B: Matrix Chain Multiplication
In the top-down method:
```python
from functools import lru_cache

n = int(input())
p = [int(input().split()[0]) for _ in range(n-1)] + [*map(int, input().split())]

@lru_cache(maxsize=None)
def n_mul(i, j):
    if j - i == 1:
        return 0
    return min(n_mul(i, k) + n_mul(k, j) + p[i]*p[k]*p[j] for k in range(i+1, j))

print(n_mul(0, n))
```
- Consider the minimum number of scalar multiplications for the matrix product $\prod_{k=0}^{n-1} A_k$ with $A_k \in \mathrm{M}(p_k, p_{k+1})$. `n_mul(i, j)` solves the subproblem for $\prod_{k=i}^{j-1} A_k$.
In the bottom-up method:
```python
n = int(input())
p = [int(input().split()[0]) for _ in range(n-1)] + [*map(int, input().split())]
n_mul = [[0]*(n+1) for _ in range(n+1)]
for l in range(2, n+1):
    for i in range(n-l+1):
        j = i + l
        n_mul[i][j] = min(n_mul[i][k] + n_mul[k][j] + p[i]*p[k]*p[j] for k in range(i+1, j))
print(n_mul[0][n])
```
- The subproblem results `n_mul[i][j]` are filled in ascending order of the subproblem size $l = j - i$.
ALDS1_10_C: Longest Common Subsequence
In Python, the following natural solutions get TLE partway through, whether written top-down or bottom-up, even though both have the same time complexity.
In the top-down method:
```python
import sys
from functools import lru_cache

sys.setrecursionlimit(100000)
for _ in range(int(input())):
    x, y = input(), input()
    @lru_cache(maxsize=None)
    def lcs(i, j):
        if i == 0 or j == 0:
            return 0
        if x[i-1] == y[j-1]:
            return lcs(i-1, j-1) + 1
        return max(lcs(i-1, j), lcs(i, j-1))
    print(lcs(len(x), len(y)))
```
- Since the recursion depth exceeds Python's default recursion limit, the limit is reset to a suitably large value (100000).
In the bottom-up method:
```python
for _ in range(int(input())):
    x, y = input(), input()
    l = [[0] * (len(y)+1) for _ in range(len(x)+1)]
    for i in range(len(x)):
        for j in range(len(y)):
            l[i+1][j+1] = l[i][j] + 1 if x[i] == y[j] else max(l[i][j+1], l[i+1][j])
    print(l[-1][-1])
```
In the bottom-up version, each iteration of the loop over `i` computes the row `l[i+1][:]`, and this uses only the immediately preceding row `l[i][:]`. Therefore, by keeping just one current row `l` (with a temporary copy `l_pre` of the previous row), updating it in place at each `i`, and discarding the values of older rows, the space complexity is reduced. Rewritten this way, it passes:
```python
def lcs(X, Y):
    l = [0] * (len(Y) + 1)
    for x in X:
        l_pre = l[:]
        for j, y in enumerate(Y):
            if x == y:
                l[j+1] = l_pre[j] + 1
            elif l[j+1] < l[j]:
                l[j+1] = l[j]
    return l[-1]

for _ in range(int(input())):
    print(lcs(input(), input()))
```
If this code is rewritten in the form below, without defining the function `lcs`, it gets TLE even though the logic should be identical. I don't fully understand why wrapping the code in a function definition reduces the running time; I would be grateful if someone who knows could tell me.
```python
for _ in range(int(input())):
    X, Y = input(), input()
    l = [0] * (len(Y) + 1)
    for x in X:
        l_pre = l[:]
        for j, y in enumerate(Y):
            if x == y:
                l[j+1] = l_pre[j] + 1
            elif l[j+1] < l[j]:
                l[j+1] = l[j]
    print(l[-1])
```
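For what it's worth, one plausible explanation is that CPython resolves local variables (inside a function) faster than globals (at module level), since locals compile to array lookups while globals go through a dictionary lookup on every access. A rough experiment to test this hypothesis (a sketch; absolute timings will vary by machine):

```python
import time

n = 10**7

t0 = time.perf_counter()
total = 0
for i in range(n):  # `total` and `i` are module-level globals here
    total += i
t1 = time.perf_counter()

def run(n):
    total = 0
    for i in range(n):  # `total` and `i` are function locals here
        total += i
    return total

t2 = time.perf_counter()
run(n)
t3 = time.perf_counter()

print(f"module level: {t1-t0:.2f}s  inside function: {t3-t2:.2f}s")
```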
ALDS1_10_D: Optimal Binary Search Tree — omitted for now, since it is omitted in the spiral book
The main graph-search algorithms, which visit every vertex of a graph along its edges, are depth-first search (DFS), which explores discovered vertices last-in first-out like a stack, and breadth-first search (BFS), which explores discovered vertices first-in first-out like a queue. Both can be used whether the graph is directed or undirected.
ALDS1_11_A: Graph
```python
n = int(input())
A = [[0] * n for _ in range(n)]
for _ in range(n):
    u, _, *V = map(int, input().split())
    for v in V:
        A[u-1][v-1] = 1
for a in A:
    print(*a)
```
- The adjacency-list representation needs only $O(E + V)$ space, but determining whether there is an edge between two given vertices costs a list search. The adjacency-matrix representation, on the other hand, requires $O(V^2)$ space but answers whether there is an edge between two given vertices in $O(1)$ time.
- This code explicitly builds the adjacency matrix `A`, but in this problem the vertices actually seem to be given in order starting from 1, so it also passes if you skip building the matrix and just read and print row by row.
ALDS1_11_B: Depth First Search
```python
n = int(input())
G = [None] * n
for _ in range(n):
    u, _, *V = map(int, input().split())
    G[u-1] = [v-1 for v in V]
d, f = [None] * n, [None] * n
time = 0

def visit(u):
    global time
    time += 1
    d[u] = time
    for v in G[u]:
        if not d[v]:
            visit(v)
    time += 1
    f[u] = time

for u in range(n):
    if not d[u]:
        visit(u)
    print(u+1, d[u], f[u])
```
- It can be written with an explicit stack, but writing it with recursion like this keeps the code simpler (a stack version is sketched after this list).
- Undiscovered vertices (white, in the three states below) are visited in turn. The visit to a vertex is completed by first discovering it, then visiting every undiscovered vertex adjacent to it, and then finishing it. The discovery time and finish time are recorded in `d` and `f`, respectively.
- Each vertex is in one of the following three states: both `d` and `f` are `None` (undiscovered); `d` holds the discovery time but `f` is `None` (discovered but unfinished); `d` holds the discovery time and `f` holds the finish time (finished).
- The time complexity of this algorithm is linear in the size of the graph, $\Theta(V + E)$: the loop over the vertices `u` contributes $\Theta(V)$ plus the cost of the calls to `visit`, while `visit` is called exactly once per vertex and each call costs on the order of the number of edges leaving that vertex, so the calls sum to $\Theta(E)$.
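As mentioned in the first bullet, the same traversal can be written with an explicit stack. Here is one sketch (reusing `G`, `d`, `f`, and `time` from the code above; the name `visit_iter` is my own), which keeps an iterator per stack frame so that each adjacency list is scanned only once:

```python
def visit_iter(s):
    global time
    time += 1
    d[s] = time
    stack = [(s, iter(G[s]))]
    while stack:
        u, it = stack[-1]
        for v in it:
            if not d[v]:
                time += 1
                d[v] = time
                stack.append((v, iter(G[v])))
                break
        else:  # adjacency list exhausted: finish u
            time += 1
            f[u] = time
            stack.pop()
```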
ALDS1_11_C: Breadth First Search
```python
n = int(input())
G = [None] * n
for _ in range(n):
    u, _, *V = map(int, input().split())
    G[u-1] = [v-1 for v in V]
d = [0] + [-1] * (n-1)
q = [0]
while q:
    v = q.pop(0)
    for u in G[v]:
        if d[u] == -1:
            d[u] = d[v] + 1
            q.append(u)
for v in range(n):
    print(v+1, d[v])
```
- The time complexity of this algorithm is also linear in the size of the graph, $\Theta(V + E)$: the `while` loop over the queue `q` contributes $\Theta(V)$, since each vertex is inserted into the queue once and popped as `v` once, plus the cost of the inner `for` statements; each `for` statement costs on the order of the number of edges leaving its vertex `v`, so they sum to $\Theta(E)$ over all vertices.
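One caveat: `q.pop(0)` on a Python list is itself $O(\mathrm{len}(q))$, so to actually realize the stated bound the queue can be a `collections.deque`, whose `popleft` is $O(1)$. A sketch of just the changed part:

```python
from collections import deque

q = deque([0])
while q:
    v = q.popleft()  # O(1), unlike list.pop(0)
    for u in G[v]:
        if d[u] == -1:
            d[u] = d[v] + 1
            q.append(u)
```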
ALDS1_11_D: Connected Components
```python
n, m = map(int, input().split())
G = [[] for _ in range(n)]
for _ in range(m):
    s, t = map(int, input().split())
    G[s].append(t)
    G[t].append(s)
reps = [None] * n
for s in range(n):
    if reps[s] is None:
        reps[s] = s
        q = [s]
        while q:
            v = q.pop(0)
            for u in G[v]:
                if reps[u] is None:
                    reps[u] = s
                    q.append(u)
for _ in range(int(input())):
    s, t = map(int, input().split())
    print("yes" if reps[s] == reps[t] else "no")
```
- Decomposing an undirected graph into connected components can be done in $O(V + E)$ using either depth-first or breadth-first search; this code uses breadth-first search. For directed graphs, decomposing into strongly connected components, defined by mutual rather than one-way reachability, is a little harder.
- By recording in `reps` the representative element of the connected component each vertex belongs to, whether two vertices are in the same component can be checked in $O(1)$ by comparing their `reps` entries. Unvisited vertices have `reps` equal to `None`. The representative chosen is the vertex `s` visited first within the component.
ALDS1_12_A: Minimum Spanning Tree
```python
from heapq import heapify, heappop

n = int(input())
G = [[*map(int, input().split())] for _ in range(n)]
heap = [[float("inf"), v] for v in range(n)]
heap[0] = [0, 0]
weights = 0
while heap:
    heapify(heap)
    w, u = heappop(heap)
    weights += w
    for i, [w, v] in enumerate(heap):
        if 0 <= G[u][v] < w:
            heap[i] = [G[u][v], v]
print(weights)
```
- The minimum spanning tree is defined for connected, undirected, weighted graphs, and it is known that a greedy approach works: for a growing edge set A, take any cut that no edge of A crosses and add to A a minimum-weight edge crossing that cut. Concretely, there are Kruskal's algorithm, which starts from a forest of single-vertex trees and glues trees together into larger ones, and Prim's algorithm, which starts from a single-vertex tree and grows it by adding edges. This code follows Prim's algorithm: it records the cost of adding each vertex and adds vertices in increasing order of that cost. Each time a vertex is added, for every edge leaving it, the adjacent vertex's addition cost is lowered if it can be lowered.
- The complexity of these algorithms depends on how their data structures are implemented, but log-linear in the size of the graph is achievable. For Kruskal's algorithm it depends on the disjoint-set implementation; one implementation achieves $O(E \log V)$. For Prim's algorithm it depends on the priority-queue implementation: with a binary heap, as in this code, $O(E \log V)$ is achievable, and with a Fibonacci heap this improves to $O(E + V \log V)$. This is an improvement because a connected graph has $V - 1 \le E$, i.e. $V = O(E)$; for general graphs $V = \Omega(E^{1/2})$, and the improvement is real when $V$ is $o(E)$.
- However, in this code, by using the adjacency-matrix representation instead of the adjacency-list representation, a `for` statement over the whole heap is executed for each element popped from the heap; since the heap size is at most $V$, this contributes $O(V^2)$ to the complexity, which worsens the order.
- Furthermore, Python's `heapq` module does not implement the so-called DecreaseKey operation that lowers a node's value while maintaining the heap condition, so as a workaround this code lowers the node's value and then calls `heapify` each time to restore the heap condition. Since `heapify` is $O(n)$, this is slower than an $O(\log n)$ DecreaseKey; the heap size $n$ is at most $V$ and `heapify` is called $V$ times, so the contribution from `heapify` is also $O(V^2)$ in total. For what it's worth, `heapq` has an informal function `_siftdown(heap, start_pos, pos)` which, after the value of the node at `pos` has been lowered so that the heap condition no longer holds, can restore the heap condition in $O(\log n)$ (see the first sketch after this list); I don't use it here, since the solution passes with `heapify` anyway.
- The min-heap `heap`, holding each vertex's addition cost, is built; the vertex with the smallest addition cost is popped and its cost added to the total weight `weights`. To update the addition costs of the vertices `v` adjacent to the popped vertex `u`, we need not only the popped cost `w` but also which vertex `u` was popped, so the heap stores lists `[w, u]` of the addition cost `w` and the vertex number `u`. Since Python compares lists lexicographically from the first component, `heappop` correctly retrieves the minimum-cost entry `[w, u]`.
- For each vertex `v` adjacent to the popped vertex `u` (that is, with `G[u][v]` non-negative), if adding `v` via `G[u][v]` is cheaper than its current addition cost `w`, the addition cost of `v` is updated from `w` to `G[u][v]`.
- Though not available on AOJ, in an environment where scipy can be used, such as AtCoder, the minimum spanning tree can be obtained with the function `scipy.sparse.csgraph.minimum_spanning_tree()` without writing the code yourself (see the second sketch after this list).
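First, a sketch of the `_siftdown` approach mentioned above (it is a private, undocumented helper, so its behavior may change between Python versions; `new_cost`, `i`, and `v` here are hypothetical placeholders):

```python
from heapq import _siftdown  # private helper: use at your own risk

# ... after deciding to lower the key stored at index i ...
heap[i] = [new_cost, v]      # new_cost is a hypothetical smaller key
_siftdown(heap, 0, i)        # restores the heap invariant in O(log n)
```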
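Second, a sketch of the scipy approach for this problem's adjacency-matrix input (assuming no zero-weight edges, since the dense representation treats 0 as "no edge"; the `-1` entries meaning "no edge" in this input must therefore be zeroed out first):

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

n = int(input())
A = np.array([input().split() for _ in range(n)], dtype=float)
A[A < 0] = 0                    # -1 means "no edge" in this input
mst = minimum_spanning_tree(A)  # sparse matrix holding the MST edges
print(int(mst.sum()))           # total weight of the minimum spanning tree
```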
ALDS1_12_B: Single Source Shortest Path I
ALDS1_12_C: Single Source Shortest Path II

These two problems are the same except for the input size and the time limit. The single-source shortest path problem is defined for weighted directed graphs, and it has no solution when a negative cycle (a cycle whose total weight is negative) is reachable from the source.

- A general weighted directed graph can be solved with the Bellman-Ford algorithm in $O(VE)$. This algorithm can also detect the existence of a negative cycle reachable from the source (a sketch follows below).
- If the graph is known to be acyclic, i.e. a DAG, there is no negative cycle. In this case the complexity can be reduced to $O(V + E)$ using a topological sort.
- If, on the other hand, all weights are known to be non-negative, there is again no negative cycle. In this case it can be solved with Dijkstra's algorithm, whose complexity depends on the priority-queue implementation: $O((V + E) \log V)$ with a binary heap and $O(V \log V + E)$ with a Fibonacci heap.

In this problem we don't know whether the graph is acyclic, but we do know that all weights are non-negative, so it is solved with Dijkstra's algorithm. Dijkstra's algorithm resembles Prim's algorithm for the minimum spanning tree above: it finalizes the shortest path and its distance $d$ vertex by vertex, in increasing order of distance $d$ from the source. Each time a vertex is finalized, for every edge leaving it, the adjacent vertex's distance $d$ is updated if it can be shortened.
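For reference, a minimal sketch of the Bellman-Ford algorithm mentioned above (the function name and the `edges` list of hypothetical `(u, v, w)` triples are my own illustration, not this problem's input format):

```python
def bellman_ford(n, edges, src=0):
    """edges: list of (u, v, w) triples. Returns distances from src,
    or None if a negative cycle is reachable from src."""
    INF = float("inf")
    d = [INF] * n
    d[src] = 0
    for _ in range(n - 1):       # relax every edge V-1 times
        for u, v, w in edges:
            if d[u] + w < d[v]:
                d[v] = d[u] + w
    for u, v, w in edges:        # one more pass detects negative cycles
        if d[u] + w < d[v]:
            return None
    return d
```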
The following code, which closely resembles the Prim's-algorithm code for the minimum spanning tree problem, passes I but gets TLE on II:
```python
from heapq import heapify, heappop

n = int(input())
G = [None] * n
for _ in range(n):
    u, k, *vc = map(int, input().split())
    G[u] = {vc[2*i]: vc[2*i+1] for i in range(k)}
d = [None] * n
heap = [[float("inf"), v] for v in range(n)]
heap[0] = [0, 0]
while heap:
    heapify(heap)
    d_u, u = heappop(heap)
    d[u] = d_u
    for i, [d_v, v] in enumerate(heap):
        if v in G[u]:
            d_v_new = d_u + G[u][v]
            if d_v > d_v_new:
                heap[i] = [d_v_new, v]
for v in range(n):
    print(v, d[v])
```
The problem is the $O(V^2)$ part that runs a `for` statement over the whole heap for each vertex `u` popped from the heap.

- I would like to narrow the `for` statement over the heap down to one over the adjacency list, but then I don't know where in the heap each adjacent vertex `v` is stored, so its distance cannot be lowered in place. Python's heap is a plain list rather than an OrderedDict or the like, and finding a value in it costs $O(n)$, so searching the heap every time for where the adjacent vertex `v` is stored would not lower the overall complexity after all.
- Therefore I give up on updating the entry of `v` inside the heap and instead push a new entry for `v` with the smaller distance each time it is updated. The heap may then contain multiple entries for the same vertex with different distances, but since it is a min-heap, the most recently added, lightest entry is popped first. Stale entries popped on the second and later occasions carry a larger distance than when the vertex was first popped, so for them the `if d[v] > d_v` test never becomes `True` and they do nothing in particular; they are harmless.
Based on the above, if you rewrite as below, II will also pass:
```python
from heapq import heappop, heappush

n = int(input())
G = [None] * n
for _ in range(n):
    u, k, *vc = map(int, input().split())
    G[u] = {vc[2*i]: vc[2*i+1] for i in range(k)}
d = [0] + [float("inf")] * (n-1)
heap = [[0, 0]]
while heap:
    d_u, u = heappop(heap)
    for v in G[u]:
        d_v = d_u + G[u][v]
        if d[v] > d_v:
            d[v] = d_v
            heappush(heap, [d[v], v])
for v in range(n):
    print(v, d[v])
```
- Though not available on AOJ, in an environment where scipy can be used, such as AtCoder, you can use the function `scipy.sparse.csgraph.dijkstra()` without writing Dijkstra's algorithm yourself.
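A sketch of that usage on a small hypothetical graph (edge weights held in a sparse matrix, where 0 means "no edge"):

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import dijkstra

# hypothetical 4-vertex directed graph: entry (u, v) is the weight of u -> v
W = csr_matrix(np.array([[0, 1, 4, 0],
                         [0, 0, 2, 5],
                         [0, 0, 0, 1],
                         [0, 0, 0, 0]]))
dist = dijkstra(W, indices=0)  # single-source distances from vertex 0
print(dist)                    # [0. 1. 3. 4.]
```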