[PYTHON] [Competition Pro] Solve the number of M cards taken out from N cards using a route [Explanation in the figure]

I found a problem that looked interesting, so I tried to solve it.

problem

There are $ N $ cards, and $ A_i $ is written on the $ i $ th card.
Mr. T pulls out $ M $ sheets from this at the same time. How many pairs of numbers are written on the removed card?
However, it does not distinguish between pairs that are just swapped, such as $ (1,2) $ and $ (2,1) $.

For example, consider the case of $ N = 5 $, $ M = 2 $, $ \ {A \} = \ {1,2,2,3,4 \} $. In this case, $ 7 $ of $ (1,2), (1,3), (1,4), (2,2), (2,3), (2,4), (3,4) $ There is a pair of.

I'm not afraid of counting anymore ~ 35 selections of competition pro counting problems ~ --Qiita Some problems have been modified.

The person who writes this article

The person writing this article has no experience in competitive programming. I don't know how to be a competitive professional, so please take a good look at it.

Prerequisite knowledge

Number of grid paths

Consider such a grid-like path.

n_m_lattice.png

If you can only go to the right or up, the number of routes from $ P $ to $ Q $ is

{}_{n+m} \mathrm{C} _m

is.

Calculate the number of routes for each step

Due to the problem of the number of grid paths mentioned above, the horizontal axis is $ p $ and the vertical axis is $ q $, and the number of paths to a certain point $ (p, q) $ is $ (p-1, q) $. And the number of routes to $ (p, q-1) $.

p_q_plus_from_previous.png

Taking advantage of this property, the distance [^ 1] from $ P $ is increased by $ 1 $. When you reach the end, you will get the value you want. Let's check with the grid of $ 4 \ times 3 $.

[^ 1]: The distance here is the length of the nearest intersection from the intersection of the grid as 1.

step_lattice_1.png step_lattice_2.png

It turns out that the number of routes in the $ 4 \ times 3 $ grid is 35. This result matches $ {} _7 \ mathrm {C} _3 = 35 $.

Grid deformation

Such a grid will appear in the following sections. The number of combinations does not change, only the vertical axis is slanted.

sloping_lattice.png

Algorithm of the subject

There are $ N $ cards, and $ A_i $ is written on the $ i $ th card.
Mr. T pulls out $ M $ sheets from this at the same time. How many pairs of numbers are written on the removed card?
However, it does not distinguish between pairs that are just swapped, such as $ (1,2) $ and $ (2,1) $.

Consider the case of $ N = 7 $, $ M = 4 $, $ \ {A \} = \ {1,2,2,3,3,3,4 \} $.

Creating a grid

First, create the following grid.

choice_lattice.png

Write the numbers contained in $ A $ on the horizontal axis of the grid. If the same number is included, write it consecutively. The solid black line indicates whether to use the numbers written below. Going to the upper right means using that number, and going to the right side means not using that number.

For example

choice_1234.png

The pink path represents selecting $ (1, 2, 3, 4) $.

Similarly

choice_2233.png

Means to select $ (2, 2, 3, 3) $.

Duplicate combination

in this way,

choice_2334_1.png

When

choice_2334_2.png

Both represent selecting $ (2, 3, 3, 4) $, and there are multiple routes for a combination.

Avoid duplicate combinations

Change the route so that there is one route for one combination. When selecting a number, change to the route ** If there is the same number, select all the same numbers to the right of the selected number **. This allows you to eliminate duplicate combinations.

change_choice_path.png

The light blue line is the deleted route, and the red is the added route. If you select $ 2 $ on the left, select $ 2 $ on the right as well. Similarly, if you select the leftmost $ 3 $, select the remaining two $ 3 $, and if you select the middle $ 3 $, select the right $ 3 $.

As a result, the only routes to select $ (2, 3, 3, 4) $ are as follows.

choice_2334_fix.png

Enumeration

Count according to the created route.

step_wind_lattice_1.png step_wind_lattice_2.png step_wind_lattice_3.png step_wind_lattice_4.png step_wind_lattice_5.png step_wind_lattice_6.png step_wind_lattice_7.png step_wind_lattice_8.png

By this operation, it turns out that the number of selecting 4 from $ \ {A \} = \ {1,2,2,3,3,3,4 \} $ is $ 11 $. It was.

How to select duplicate numbers

Now, pay attention to the part shown in red.

three_from_1.png

This number is the sum of the parts enclosed in orange.

three_from_2.png

Similarly, the part shown in red in the figure below is the sum of the parts surrounded by orange.

three_from_3.png

Similarly, the part shown in red in the figure below is the sum of the parts surrounded by orange.

two_from_1.png two_from_2.png two_from_3.png

In a normal grid, the number of routes to the point $ (p, q) $ is the sum of the number of routes to $ (p-1, q) $ and the number of routes to $ (p, q-1) $. did.

Similarly, if there are two identical numbers, the number of routes to $ (p, q) $ is $ (p-2, q) $, $ (p-1, q-1) $, $ (p, q). -2) $ It is the sum of the number of routes.

Similarly, if there are three identical numbers, the number of routes to $ (p, q) $ is $ (p-3, q) $, $ (p-2, q-1) $, $ (p-1). , p-2) $, $ (p, q-3) $ The number of routes is added.

You can see that the route map can be rewritten as follows.

duplicated_choice_lattice.png

Enumeration (again)

Count again with the rewritten route.

step_last_lattice_1.png step_last_lattice_2.png step_last_lattice_3.png step_last_lattice_4.png step_last_lattice_5.png

So far, we have seen a concrete example of an algorithm that selects $ M $ cards from $ N $ cards that contain duplicate numbers.

algorithm

Generalize the above algorithm.

-$ A $ represents a set of cards --The number of routes to the $ point (p, q) $ is represented by $ route (p, q) $. $ route (0, 0) = 1 $ --The distance from the origin is represented by $ step $. The initial value of step is $ 0 $

  1. Repeat the following until $ A $ is empty
  2. Extract all one kind of number $ a $ from $ A $ and delete it from $ A $. Let $ c $ be how many $ a $ are
  3. Add $ c $ to $ step $
  4. For an integer $ (p, q) $ that satisfies $ p + q = step \ (0 \ leqq p \ leqq N-M, 0 \ leqq q \ leqq M) $ $ route (p, q) = route (p --c, q) + route (p --c + 1, q -1) + route (p --c + 2, q --2) + ... + route (p) --1, q- (c -1)) + route (p, q --c) $
  5. $ route (N-M, M) $ is the number you want.

Algorithm modification

The above algorithm is $ M $, such as $ N = 7 $, $ M = 2 $, $ \ {A \} = \ {1,2,2,3,3,3,4 \} $ It doesn't work if you have numbers with higher multiplicity (this time there are 3 $ 3 $). If there are more than $ M $ of duplicate numbers, you need to reduce the multiplicity to $ M $ in advance.

Algorithm improvements

Sort by the degree of duplication of numbers

Arrange the numbers by multiplicity. Write the number that appears only once on the far left.

sort_by_duplication.png

The number when it appears only once can be calculated.

calc_one_combi.png

(This is why $ step $ does not start at $ 0 $ in the linked program.)

In the case of a simple lattice, the coefficient when $ (x + y) ^ n $ is expanded appears on the diagonal line (Pascal's triangle / binomial theorem).

Similarly, there is regularity when two identical numbers appear, but I haven't generalized them well yet (I). The number of two appearing is related to the coefficient of $ (x ^ 2 + x + 1) ^ n $. This property can be used to further improve the algorithm.

code

I implemented the above algorithm in Python (version 3.7). combination.count_uniq_combination_lattice

I check if the program is correct by checking if it is the same as this straightforward answer. combination. count_uniq_combination_simple

Reference link

Afterword

When I first saw the problem, I thought, "I don't know where to start ..." Now that I've written the commentary, I'm starting to think, "Oh, this is surprisingly simple, and it may be common sense in the professional competition world." Perhaps there is already a similar commentary article, but I will publish it because it was good for my own organization.

Recommended Posts

[Competition Pro] Solve the number of M cards taken out from N cards using a route [Explanation in the figure]
Find the number of days in a month
Find out the maximum number of characters in multi-line text stored in a data frame
[Python] Representing the number of complaints from life insurance companies in a bar graph
What seems to be a template of the standard input part of the competition pro in python3
Maya | Find out the number of polygons in the selected object
Find out the apparent width of a string in python
Examine the margin of error in the number of deaths from pneumonia
Python --Find out number of groups in the regex expression
Get the number of readers of a treatise on Mendeley in Python
[Completed version] Try to find out the number of residents in the town from the address list with Python
A story about creating a program that will increase the number of Instagram followers from 0 to 700 in a week