Introduction

When I was worried about "TLE even though it seems to be in time for the calculation amount ..." while devoting myself to the competition pro, it was discovered that the sort part of the preparation was a bottleneck. Share the points you're addicted to.

tuple array sort

I will briefly explain the difference between with and without key specification.

key specified

a = [(3,3), (2,2), (3,1), (4,2)]
a.sort(key=lambda x: x[0])
print(a) # => [(2, 2), (3, 3), (3, 1), (4, 2)]

Sort is performed according to the size of the specified key. If the key elements are equal, the order is preserved.

The sort () method is guaranteed to be stable. Sorting is stable as long as it is guaranteed that the relative order of equal elements will not change. (Official documentation)

`(x[0],x[1]))It is possible to specify multiple keys as tuples like`


 In this article, we will consider specifying a single element as a key.



## No key specified
 If you do not specify any key
 "Compare size with the 0th element of tuple"-> "Compare size with the 1st element if they are the same"-> "..."
 Sort is done in this way.

```python
a = [(3,3), (2,2), (3,1), (4,2)]
a.sort()
print(a) # => [(2, 2), (3, 1), (3, 3), (4, 2)]

At this time, it seems that the comparison operation between tuples is performed internally.

key specifies a function that takes one argument and is used to retrieve the comparison key from each element of the list (for example, key = str.lower). The key corresponding to each item is calculated once and used for the entire sorting process. ** The default value None means that the values in the list are sorted directly without calculating another key value. ** (Official documentation)

If you just "sort by the first element of tuple", you can achieve the purpose with or without key specification. If you don't specify the key here, thinking ** "Then you don't need to specify the key" **, you may fall into an unexpected trap. ** Comparison between tuples is slow. ** **

Experiment

"Comparison between tuples" vs. "Comparison between tuple elements"

Check how much speed difference there is between the two. The execution environment is AtCoder's code test (PyPy3). There are 3 elements in each tuple.

import random
random.seed(1)

''' n:3 ways
N = 1*(10**3)
N = 3*(10**3)
N = 5*(10**3)
'''

tl = [] # list of tuples
for i in range(N):
    l = random.randint(1,N)
    r = random.randint(1,N)
    tl.append((l,r,i)) # 3 elements in tuple

''' 
# 1.Comparison between tuples
for i in range(N):
    for j in range(N):
        res = tl[i] < tl[j]
# 2.Comparison of tuple elements
for i in range(N):
    for j in range(N):
        res = tl[i][0] < tl[j][0]
'''

The measurement results are as follows.

N	1.tuples	2.Tuple elements
1*10^3	78 ms	17 ms
3*10^3	672 ms	129 ms
5*10^3	1875 ms	368 ms

Since there are three tuple elements, I thought, "Is it an error of about 3 times at most?", But there was a big difference. scared.

tuple array sort "key not specified" vs. "key specified"

import random
random.seed(1)

''' n:3 ways
N = 1*(10**5)
N = 3*(10**5)
N = 5*(10**5)
'''

tl = [] # list of tuples
for i in range(N):
    l = random.randint(1,N)
    r = random.randint(1,N)
    tl.append((l,r,i)) # 3 elements in tuple

'''
3. tl.sort()
4. tl.sort(key=lambda x:x[0])
'''

The measurement results are as follows.

N	3.No key	4.With key
1*10^5	198 ms	98 ms
3*10^5	735 ms	359 ms
5*10^5	1361 ms	708 ms

Although not as much as in the first experiment, the difference was about twice as large.

in conclusion

It may not be a big difference, but depending on the problem, this can be addictive. Some people may have seen the code of the experiment and wondered, "Is that the problem?" By the way, it seems that `` `itemgetteris faster than lambda``` for key specification.

Sort of tuple array can be accelerated by specifying key (Python)