[PYTHON] How to solve dynamic programming algorithm problems (as seen by beginners)

Dynamic Programming (DP)

It is one of the problems often asked in programming contests such as LeetCode and AtCoder, and when coding by round-robin (blue force), it becomes an exponential complexity (for example, $ O (2 ^ n) $ ). You can significantly reduce the amount of code calculation (for example, $ O (n ^ 2) $, etc.). Therefore, I think that it is a way of thinking that can be useful not only in programming contests but also in everyday coding.


Caution)

Why dynamic programming is hard to understand at first glance

For example, in the case of wikipedia : ** Definition **

Algorithms are not defined in detail, but are a general term for algorithms that satisfy the following two conditions. ** Definition 1. ** Use of inductive relations: Use the solutions of smaller problem examples (subset sum problems) and calculation results to solve larger problem examples using inductive relations. ** Definition 2. ** Record calculation results: Record small problem examples, calculation results, and avoid repeating the same calculation. In order to efficiently refer to inductive relationships, calculation results are managed with integers, characters and their combinations as headings.

While explaining these things, we will look at the procedure for solving dynamic programming problems.

Dynamic programming problem answering procedure

Example: Knapsack problem

Take the famous knapsack problem as an example. problem: You are a thief I sneaked into a watch store to steal my watch. When the array of clocks is represented by $ S $, there are a large number of expensive clocks of various prices up to $ S [0], ... S [i], ..., S [n-1] $. Yes, but there is a fixed weight of $ W $ that you can take home. How can I find the highest total price within the constraint weight limit of $ W $?

組み合わせ.png Figure 1 Knapsack problem

For example, if there are 1000 clocks ($ n = 1000 $), if you try all the clock combinations, you will consider two ways for all clocks, including or not including each clock, so $ 2 ^ {n} = 2 ^ {1000} ≒ 10 ^ {301} $, which is an exponentially enormous amount of calculation.

1. Determine if dynamic programming may be used

When tackling a question within a limited answer time, I think it is necessary to first make an idea as to whether or not dynamic programming can be applied to avoid unnecessary consideration. Therefore, the purpose here is not to prove that dynamic programming can be applied, but to get an idea of whether dynamic programming can be applied to this problem in a short time.

Now, whether dynamic programming can be applied is whether what is written in the definition can be applied. How can I check that? It is a candidate for dynamic programming if the following ** Condition I. ** and ** Condition II. ** are likely to be found in the first few minutes.

Condition I. Subset sum problem can be solved

According to ** Definition 1. **, it is essential to be able to solve a partial problem by decomposing a large problem example. What kind of case does this "subset sum problem" mean? That is, if $ S $ is an array of input data, a solution can be found for a part of $ S $ under the same conditions as the whole solution.

Specifically, in the knapsack problem, when considering only the watches after $ i $ (= $ S [i:] $) from among the many watches (= $ S $) in the watch shop, for example, It means that you can find the maximum total price that will fit in a knapsack under the constraint of a weight limit of $ w $. We will write $ V [i :, w] $ as the value to be maximized or minimized in this way. In the case of the knapsack problem, $ V [i :, w] $ is the total price after the clock number $ i $ ($ i to n $) when the remaining weight limit in the knapsack is $ w $. Although the amount of calculation is enormous, $ V [i :, w] $ is a brute force approach (brute force) as described in the explanation of the example, and all combinations of watches with a weight limit of $ w $ or less can be considered. If so, it can be calculated as the value with the largest total price. Therefore, it seems that this condition is met.

A part of the array is the following three types of partial arrays.

This time, as an example, the case of $ S [i:] $ --array suffix (suffix) is taken.

Condition II. The problem can be solved recursively

Secondly, it is necessary to confirm whether the subproblem that can be solved by ** Condition I **. Can be solved recursively. This is what is called an "inductive relationship" in Definition 1. The fact that it can be solved recursively means that, for example, if the maximum total price up to $ S [i:] $ is expressed by $ V [i:] $, there is a relationship that can be expressed by the following recurrence formula. It is a thing.

$ V [i:] = V [i + 1:] + v_ {i + 1} $ ($ v_ {i + 1} $ is a constant) ・ ・ ・ ①

In other words, if you can guess that $ V [i:] $ is likely to be calculated with the computational complexity $ O (1) $ by using $ V [i + 1:] $, it is good at this stage.

In the case of the knapsack problem, at this stage it will be a little complicated considering the weight limit, so ignore it and think that $ v_ {i} $ is the price of the watch $ i $, then by ① $ V [i:] $ You can find the maximum value of. That said, since the weight limit is ignored, it is simply the total price of all watches, but considering the weight limit later, we were able to confirm a recursive relationship, so at this stage it is in condition II. Suppose you have a clue.

2. Think of a recursive function (or recurrence formula)

So far, by examining the above ** Condition I. ** and ** Condition II. ** in a short time, I have come to the conclusion that this problem is likely to be solved by dynamic programming. Therefore, we finally consider an algorithm that uses dynamic programming. For the sake of explanation below, I will write the formula first and then the code, but when solving the actual problem, you can think from the code. In my case, the code is easier to think of than the expression, and in the end it is the code that is needed to solve the problem, so I think from the beginning.

For ** Condition II. **, let's formulate it considering the weight limit. For example, since we are thinking of subarray = suffix this time, let's think of each clock $ S [i] $ in descending order from $ i = n-1 $ to $ i = 0 $. As mentioned earlier, you need to think about whether you want to put the $ i = n-1 $ watch in the knapsack first or not. In the formula, ① can be expanded and expressed as follows. If you want to put a $ i = n-1 $ watch in your knapsack: $ V [n-2 :, w] = V [n-1 :, w-weight_ {n-1}] + value_ {n-1} $ ・ ・ ・ ③ If you don't put $ S [n-1] $ in the knapsack: $ V [n-2 :, w] = V [n-1 :, w] $ Where $ weight_ {n-1} $ is the weight of the watch $ n-1 $ and $ value_ {n-1} $ is the price.

Once $ V [n-2 :, w] $ is obtained from the above formula, next, for each of formulas ③ and ④, consider the case where i = n-2 is put in the knapsack or not.

Therefore, in order to apply this relationship to all $ i = 0, ..., n-1 $, generalize by replacing equations ③ and ④ with $ n-2 = i, n-1 = i + 1 $. Then you can write: To put $ S [i + 1] $ in knapsack: $ V [i :, w] = V [i + 1 :, w-weight_ {i + 1}] + value_ {i + 1} $ ・ ・ ・ ⑤ If you don't put $ S [i + 1] $ in the knapsack: $ V [i :, w] = V [i + 1 :, w] $

Please note that the weight limit of $ w $ will be reduced by the weight of the watch when you put it in the knapsack.

If you write this as python code, it looks like this: (* Here, for the sake of clarity, the direct expression is converted into a code, and it is actually necessary to consider the boundary conditions, etc.)

# weight:Weight of each watch:
#Example)weight = [10, 20, 30, 35]
# value:Price of each watch=
#Example)value = [5,12,17,20]

def V(i, w):
  #When including clock i:
  if w >= weight[i]:
    V1 = V(i+1, w-weight[i]) + value[i]    
  #If the clock i is not included:
  V2 = V(i+1, w)

Now that we have two recurrence formulas, we can combine the two recurrence formulas into one by considering ** Condition I. **.

3. Make a recurrence function (recurrence formula) that solves a partial problem

In ** Condition I. **, we considered that it is possible to obtain the maximum total price under the weight limit of $ w $ by brute force (brute force) up to the clock number $ i $. And it should be possible to find it using the recurrence formula as well. When there are two values of total price $ V [i :, w] $ as in equations ⑤ and ⑥, if the weight limit $ w $ is the same, the one with the smaller total price can be the maximum value. Since there is no such thing, it can be immediately determined that it is unnecessary. Therefore, the two recurrence formulas ⑤ and ⑥ can be summarized in the following formulas. $ V [i :, w] = max (V [i + 1 :, w-weight_ {i + 1}] + value_ {i + 1} $, $ V [i + 1 :, w] $) ・ ・ ・ ⑦

Now we have an expression to find the maximum value of the subproblem. Written in python, it looks like this:

# weight:Weight of each watch:
#Example)weight = [10, 20, 30, 35]
# value:Price of each watch=
#Example)value = [5,12,17,20]

def V(i, w):
  #When including clock i:
  V1 = 0
  if w >= weight[i]:
    V1 = V(i+1, w-weight[i]) + value[i]    
  #If the clock i is not included:
  V2 = V(i+1, w)
  return max(V1, V2)

Each time $ V (i, w) $ is called, we consider both with and without the watch $ i $, but the one with the same weight limit of $ w $ or less and the smaller total price is the total maximum price. Since it will never be, only the one with the larger total price is set as the return value by $ max () $. From this, you can see that the combination that cannot be the maximum value can be determined at the time of $ i $.

As a concrete example, suppose you have a knapsack with a weight limit of $ W = 5 $ and a watch like the one below.

clock 0 1 2 3
weight 2 4 2 3
price 12 20 10 17

A concrete example of this recursive function call is shown in the figure below.

再帰関数呼び出し.png Figure 2 Recursive call tree

The function call of the recursive function is from top to bottom, but the return value will be returned from the deepest callee. As a result, the weight limit is calculated from top to bottom as shown by the blue arrow, but the maximum price is calculated from bottom to top as shown by the red arrow. Each node has a red arrow from the two lower child nodes, but the child node on the right has a $ i $ clock in the knapsack, and the child node on the left has $ V [ i :, w] Represents the return value of $. Only the one with the larger maximum price returned from the child node is adopted and returned to the upper node, and finally the maximum price 29 is obtained.

By the way, I understand how to get the maximum price without brute force by using the recurrence formula (recursive function), but by using the memoization of ** Definition 2. **, the memory Instead of sacrificing usage, you can reduce the amount of calculation.

4. Make a memo

Figure 2 In the line of $ i = 2 $ in the recursive call tree, there is a node surrounded by purple. These nodes call $ V [i :, w] $ twice, which has the same weight limit $ w $ as the current clock $ i $. That is, since all the calculations of the child nodes from this node are the same, if you make a note of it when you first call it, you can omit the subsequent calculations. How many note sizes do you need? If the weight of all clocks is an integer, the range of possible values for $ w $ is $ 0 \ leq w \ leq W $, so W + 1 memos are required. Since the range of possible values for $ i $ is $ 0 \ leq i \ leq n-1 $, $ n $ worth is required. Therefore, the size of the memo is $ W × (n-1) $.

Until now, we did not consider the boundary conditions of arrays, but here we also consider the boundary conditions of arrays. Written in Python, it looks like this:


# weight:Weight of each watch:
weight = [2, 4, 2, 3]
# value:Price of each watch:
value = [12,20,10,17]
n = len(value)
#Knapsack limit capacity:
W = 5

#Secure notes
memo = [[0] * (W+1) for _ in range(n)] 

def V(i, w):
  if i >= n:
    #After the end of the array
    return 0

  if memo[i][w]:
    #If in the memo
    return memo[i][w]

  #When including clock i:
  V1 = 0
  if w >= weight[i]:
    V1 = V(i+1, w-weight[i]) + value[i]    
  #If the clock i is not included:
  V2 = V(i+1, w)
  memo[i][w] = max(V1, V2)
  return memo[i][w]

V(0, W)

5. Confirmation of calculation amount

If it is already in the memo, you do not have to calculate it, so you only have to calculate the recursive function for the number of memos. Therefore, the amount of calculation is $ O (nW) $.

It is often explained by the index of the horizontal axis arrangement, the graph or the table of the vertical axis weight, but for me it was more intuitive to explain with the recursive function as the axis, so I admit such an article. I tried it.

[^ 1]: From https://www.youtube.com/watch?v=OQ5jsbhAv_M

Recommended Posts

How to solve dynamic programming algorithm problems (as seen by beginners)
Getting started on how to solve linear programming problems with PuLP
Dedicated to beginners! How to learn programming without spending as much money as possible
An introduction to object-oriented programming for beginners by beginners
[For beginners] How to study programming Private memo
13th Offline Real-time How to Solve Writing Problems in Python
Linear programming by Karmarkar's algorithm
The 17th Offline Real-time How to Solve Writing Problems in Python
How to write offline real time Solve E04 problems in Python
JOI2019 / 2020 1st qualifying 3rd How to solve problems A and B
How to write offline real time Solve F01 problems with Python
How to solve simultaneous linear equations
How to solve slide puzzles and 15 puzzles
[Introduction to Infectious Disease Model] World Infection Status as Seen by MACD ♬