[PYTHON] I couldn't understand Fence Repair of the ant book easily, so I will follow it in detail.

Definition of terms

** Ant book **

-Programming Contest Challenge Book --A bible-like book in the area of competitive programming -Comprehensively posted basic knowledge necessary for participating in competition professionals

Fence Repair

The problem is to use the greedy algorithm, but I couldn't understand why it became greedy, so I started by trying to verify it.

Problem statement

Farmer John is trying to cut $ N $ of boards out of a very long board to repair the fence. The length of the board you are trying to cut is $ L_1, L_2, \ dots, L_N $, and the length of the original board is just the sum of these. When cutting a board, it costs as much as the length of the board. For example, suppose you want to cut a $ 5,8,8 $ $ 3 $ board out of a $ 21 $ long board. Cutting a $ 21 $ length plate into $ 13 $ and $ 8 $ length plates costs $ 21 $. Cutting that $ 13 $ board into $ 5 $ and $ 8 $ boards costs $ 13 $. It costs $ 34 $ in total. At the minimum, how much does it cost to cut out all the boards?

Constraints

For many problems, it can often be generalized to some kind of graph structure, whether it is a tree or not. The example shown in this question sentence is expressed in a binary tree as follows. fence-repair_01.png The cost at each disconnect is the value of the parent node, so the total cost is the sum of all the values of the non-leaf nodes. In the above example, the sum of the non-leaf nodes (white) is $ 13 + 21 = 34 $, which matches the total cost of the problem statement.

The purpose is to minimize the total cost at this time.

Answer policy

Ant book commentary policy

--Remove $ 2 $ of the given boards in ascending order and add the merged length boards in their place. --Add the sum of the lengths of the merged boards to the cost and recurse until the last one. ――The cost obtained at that time is minimized.

When I read it, it seemed to be correct, but I wasn't really convinced. I think that the purpose is to convey a change in the idea of thinking from the bottom up because the condition classification is awkward if you solve it from the top down, but I do not understand this when a more difficult problem is explained later. I felt like I was stuck, so I decided to pursue it.

Fixed conditions and variable factors

Seeing what is fixed (premise) and what kind of operation causes fluctuation is the basic policy for solving any problem. Since this is the problem introduced in the chapter on greedy algorithm, we can understand meta-wise that there is always an operation that reduces the total cost. In the actual solution situation, it is necessary to select the algorithm including the judgment.

Now, in this question, the lengths of all the plates at the completion of cutting are given, so the number of nodes that will eventually become leaves is fixed. In other words, the tree structure changes depending on the order in which the leaves are combined.

Verification of variable factors

Even if you try to grasp how it changes, even if you suddenly think about it in a complicated case, it will fail. Descartes, the great mathematician, also "divide the difficulties" </ b>.

Dividing into insanely simple cases and thinking is an important scheme for solving any problem. So what's the simplest case? Let's think together.

Case of N = 1

Needless to say, the case of $ N = 1 $ is the simplest. In other words, the original plate will not be cut. With this, there is no fluctuation or shit, and the cost is always $ 0 $. Let's increase $ N $.

N = 2 case

This is the case where the number of disconnections is $ 1 $. Again, there is no variation when the length of the plate after cutting is fixed. Let's increase $ N $ further.

N = 3 case

Sorry I made you wait. Variations finally come out from $ N = 3 $. First of all, for the sake of simplicity, let's put the length of the board after cutting, which is $ 3 $, as $ A \ leqq B \ leqq C $. This would be an assumption that would not lose generality.

Well, the way to cut here is a $ 3 $ pattern.

  • (A+B+C) \rightarrow (A+B),C \rightarrow A,B,C
  • (A+B+C) \rightarrow A,(B+C) \rightarrow A,B,C
  • (A+B+C) \rightarrow B,(C+A) \rightarrow A,B,C

It's difficult to understand even if I write it in letters, so I made a figure. fence-repair_02.png Only the difference in the red part of the figure affects the total cost. The position of the other parts is different, but the sum value does not change. In other words, minimizing the total cost in this case is equivalent to choosing the smallest of $ (A + B), (B + C), (C + A) $.

If you look carefully to see if there are enough variations, choose $ 2 $ out of $ 3 $, that is, $ {} _3 \ mathrm {C} _2 = {} _3 \ mathrm {C} _1 = 3 $, which is certainly a $ 3 $ pattern. There can only be. So, of course, which pattern is the smallest is $ (A + B) $, which is $ 2 $ selected from the smallest value.

In summary, if $ N = 3 $, you can cut out by selecting $ 2 $ from the smallest. Using this fact, consider the case of $ N = 4 $.

N = 4 case

With $ N = 3 $, we were able to narrow down the pattern that minimizes the total cost. From a meta point of view, since it is a greedy method, it seems that it will be possible to narrow down the pattern with a recurrence formula after this.

Now, the case of $ N = 4 $ has $ 4 $ boards. If you divide the case in the first cut, the pattern will be divided depending on whether you cut the board $ ABCD $ by $ 1 + 3 $ or $ 2 + 2 $. fence-repair_03.png

1 + 3 patterns

The pattern of carving with $ 1 + 3 $ is an extension of the case of $ N = 3 $ to the root side. After the first $ 1 $ cut, it will have the same shape as $ N = 3 $, and the board $ ABC $ will only change to $ \ {ABC, ABD, ACD, BCD \} $.

In other words, there is a total of $ 4 \ times 3 = 12 $ pattern, but in reality it comes down to the $ 4 $ pattern of which one to cut first. Now, the generalization of the total cost for the $ N = 3 $ case is $ 2A + 2B + C $. (Choosing the smaller $ 2 $ out of the $ 3 $ is the same as choosing the largest)

On the other hand, if you add $ D $, which is $ A \ leqq B \ leqq C \ leqq D $, the case is $ N = 4 $. The following $ 4 $ pattern is used to summarize the total cost.

-$ 3A + 3B + 2C + D $: Cut $ D $ first -$ 3A + 3B + 2D + C $: Cut $ C $ first -$ 3A + 3C + 2D + B $: Cut $ B $ first -$ 3B + 3C + 2D + A $: Cut $ A $ first

Now, let's stop here and take a look at the $ 2 + 2 $ pattern.

2 + 2 patterns

If you think about it for a moment, you can see that the total cost will be $ 2 \ times (A + B + C + D) $ regardless of the combination.

Compare together

By dividing the cases so far, we found that the patterns that must be considered as the case of $ N = 4 $ are narrowed down to $ 5 $ in total, including the $ 4 $ pattern mentioned earlier.

Now, let's subtract $ 2 \ times (A + B + C + D) $ from all cases to compare the magnitude relationship of $ 5 $.

-$ ; ; ; ; ; ; 0 ; ; ; ; ; ; $: First cut into $ 2 + 2 $ sheets -$ A + B-D $: First cut $ D $ -$ A + B-C $: First cut $ C $ -$ A + C-B $: Cut $ B $ first -$ B + C-A $: Cut $ A $ first

What is the lowest possible cost value? Considering $ A \ leqq B \ leqq C \ leqq D $, we can see the following.

- $ ; ; ; ; ; ; 0 ; ; ; ; ; ; $: Minimum if $ A + BD $ is positive </ strong> FONT> - $ A + B-D : Minimum in negative cases - A + B-C $: Can be negative, but not a candidate because the pattern up $ 1 $ by subtracting the larger number $ D $ is smaller -$ A + C-B \ geqq 0 $: Not a candidate because it is always $ 0 $ or more due to the size relationship -$ B + C-A \ geqq 0 $: Not a candidate because it is always $ 0 $ or more due to the size relationship

In other words, the minimum case changes depending on the judgment result of $ A + B \ leqq D $. </ b>

Return to the explanation of the ant book

Take a closer look at $ A + B \ leqq D $. In short, this is the same as adding the merge result of $ A + B $ and sorting again.

This is because $ C $ cannot be the maximum value in any case at the time of $ C \ leqq D $. Choosing $ 2 $ from the smaller of the $ 3 $ is equivalent to choosing the maximum of $ 1 $. You don't need the $ C $ information that can't be maximized.

Let's visualize this in a more understandable way. In the case of $ N = 4 $, consider the problem of selecting $ 2 $ from the smallest side and merging $ A + B $ as $ M $.

As I mentioned earlier, there are only $ 2 $ cases below. $ M $ is the largest case, or $ D $ is the largest case. fence-repair_04.png

That's right. After all, even if you knead various things from the top down, if you merge and replace from the smallest one, the case of $ N = 4 $ will end up in the case of $ N = 3 $. If you return $ A + B $ to $ M $ again for confirmation, it will be in the form of case classification that was discussed from the top down as follows. fence-repair_05.png

Recapture from the bottom up

Now let's look at the cost of each leaf. It's easy to see that the $ 1 $ leaf's total cost is the product of the leaf node's value and depth. Try tracing the leaf $ A $ in the previous figure to the root. You can see that all $ 1 $ main roads leading up to the root contain $ A $, and other routes do not contain $ A $.

And since it is difficult to think about the $ 2 $ axis of value and depth at the same time, think from the perspective of the same depth. Even if it is unknown what kind of tree structure it will be, the smaller the value, the lower the cost at the same depth. This means that it is always greedy to fill with the smallest pair, considering that the leaves are filled in order from the deepest. This is because, again, “the smaller the value, the lower the cost if the depth is the same”.

If you try to do this top-down, you'll have to make multiple selections at a depth of $ 1 $, which suddenly creates a case. As you can see that $ N = 4 $ is troublesome enough, it means that the maximum case of $ N = 20000 $ is virtually impossible.

Lessons learned

--Difficulty is divided --Try from the opposite direction --If you have multiple axes, try fixing one of them.

Please refer to the Programming Contest Challenge Book for the actual answer code.

Digression

The competition pro started in the $ 8 $ month of $ 2020 $. I usually post articles on my Personal Blog.

Recommended Posts