Environment: Python 2.7
When you pass what you want to sort, the elements are combined in a tournament formula and the two elements are manually compared. The final result is a complete ranking with no overlap in ranking. (Total order relation)
Maybe there is a better implementation, but I wanted to see if the idea was correct, so I made it.
__ Addendum (2014/12/04) __ Too natural implementation → Re: Human-powered sorting [Python]
The outline of the algorithm is as follows.
First, place all the elements in a tournament format and let them compete (pictured above). At this time, keep the tuple that has the winner at the beginning of the index. The final winner is the strongest of these factors. So this is rank 1 (green arrow).
Second, rank 2 is the second strongest, so of course you should have lost to this champion somewhere. You can search for this from the tuple you saved earlier (the part surrounded by red and the red arrow). Therefore, if you reposition these elements in a tournament format and compete (see the figure below), Rank 2 will definitely win.
After that, Rank 3 must have been defeated by Rank 2 in the previous battles (the part surrounded by blue and the blue arrow), so I will repeat this. By doing this, it is stored in the array rank from the highest rank, and you can see that you can continue until there are no more elements.
I wrote in Python what implements the above algorithm.
As I wrote in the description, this was written for self-analysis (laughs) I wanted something that would eventually become a ranking when I listed things of the same genre and answered which one I liked. is.
--Argument management with argparse. --As you can see in the last 7 lines, the function is executed recursively.
One thing to keep in mind is when you can't play a tournament with one element and when the final end is.
The function compare, which is commented out in the ~~ class Rank, was used when it was troublesome to input each time at the test stage, and was also used to measure the amount of calculation shown later. Other print commentouts are also for testing. If you display this, it will be easier to understand how it works. ~~
(Addition) For commenting out print, by adding "-v" to the option, the display is divided according to the truth of self.verbose. There seems to be a better way. Compare also switches between default_compare and auto_compare with the option "-t".
ranking.py
#!/usr/bin/env python
# -*- coding:utf-8 -*-
#
# written by ssh0, September 2014.
description = """
Ranking interface for self-analysis.
In this application
Create a comparative question based on the given list
The ranking is completed when the user answers it.
"""
import argparse
import sys
class Rank:
def __init__(self, args):
self.pair = [] # match pairs
self.rank = [] # sorted list
self.args = args
self.verbose = self.args.verbose
if self.args.test:
self.compare = self.auto_compare
else:
self.compare = self.default_compare
self.N = 0
def default_compare(self, a, b):
self.N += 1
print '\nQ[%d] which ones do you like?' % self.N
print ' [j]: %s, [k]: %s' % (a, b)
key_input = raw_input(">>> ")
if key_input == 'j':
ans = (a, b)
elif key_input == 'k':
ans = (b, a)
else:
raise ValueError('please select by "j" or "k".')
return ans
def auto_compare(self, a, b): #For measurement
self.N += 1
ans = (max(a, b), min(b, a))
return ans
def first_matching(self, data):
n = len(data)
N = 1
if n == 0:
self.func_finally()
elif n == 1:
self.rank.append(data[0])
if self.verbose:
print "n = 1 ::",
print self.rank
return self.first_matching(self.narrowing(data[0]))
else:
while n >= N:
N = N*2
unseed = 2*n - N
first_winner = []
for i in range(unseed/2):
a, b = data.pop(0), data.pop(0)
ans = self.compare(a, b)
g, l = ans[0], ans[1]
first_winner.append(g)
self.pair.append((g, l))
data += first_winner
member = [(data[2*i], data[2*i+1]) for i in range(len(data)/2)]
return member
def tournament(self, data):
member = self.first_matching(data)
if self.verbose:
print "next tournament",
print member
while len(member) > 1:
winner = []
for p in member:
ans = self.compare(p[0], p[1])
g, l = ans[0], ans[1]
self.pair.append((g, l))
winner.append(g)
member = [(winner[2*i], winner[2*i+1]) for i
in range(len(member)/2)]
ans = self.compare(member[0][0], member[0][1])
top, l = ans[0], ans[1]
self.pair.append((top, l))
self.rank.append(top)
if self.verbose:
print "pairs",
print self.pair
print "champion",
print top
print "current rank",
print self.rank
return top
def narrowing(self, x):
unsort_table = []
for_delete = []
for i, pair in enumerate(self.pair):
if x == pair[0]:
unsort_table.append(pair[1])
for_delete.append(i)
for j in for_delete:
self.pair[j] = None
self.pair = [k for k in self.pair if k is not None]
if self.verbose:
print "unsort_table",
print unsort_table
return unsort_table
def func_finally(self):
result = ''
for i, x in enumerate(self.rank):
result += ' %d: %s\n' % (i+1, x)
if self.args.outfile:
outfile = self.args.outfile
with open(outfile, 'a') as out:
out.write(result)
if not self.args.test:
print result
print "these result is saved in '%s'." % outfile
else:
print self.N
elif self.args.test:
print self.N
else:
print result
sys.exit()
if __name__ == '__main__':
parse = argparse.ArgumentParser(description=description)
parse.add_argument('-l', '--list',
dest='obj',
nargs='*',
type=str,
help='list of some objects',
default=argparse.SUPPRESS
)
parse.add_argument('-o', '--output',
dest='outfile',
type=str,
help='OPTIONAL: file to save output',
default=None
)
parse.add_argument("-v", "--verbose",
help="increase output verbosity",
action="store_true"
)
parse.add_argument("-t", "--test",
help="auto comparing mode (int, descending) for test",
action="store_true"
)
args = parse.parse_args()
data = args.obj
rank = Rank(args)
def ranking(pair_lists):
result = rank.narrowing(rank.tournament(pair_lists))
return result
a = ranking(data)
while True:
a = ranking(a)
The execution example is as follows.
➤ python ranking.py -l apple orange banana melon berry
Q[1] which ones do you like?
[j]: apple, [k]: orange
>>> j
Q[2] which ones do you like?
[j]: banana, [k]: melon
>>> j
Q[3] which ones do you like?
[j]: berry, [k]: apple
>>> j
Q[4] which ones do you like?
[j]: banana, [k]: berry
>>> k
Q[5] which ones do you like?
[j]: apple, [k]: banana
>>> j
Q[6] which ones do you like?
[j]: orange, [k]: banana
>>> j
1: berry
2: apple
3: orange
4: banana
5: melon
Since it is difficult to tell if the numbers are sorted properly from the above example, I will display the debug display and show the case where the numbers from 0 to 9 are sorted in descending order. Pairs is the list of tuples in the figure, and unsorted_table is the elements of the next tournament specified by the arrows in the figure. In the latter half, the number of elements of unsorted_table may become 1, so it is necessary to separate the processing. Ends the process when unsorted_table is empty.
➤ python ranking.py -l 1 0 6 3 4 9 5 7 8 2 -v
Q[1] which ones do you like?
[j]: 1, [k]: 0
>>> j
Q[2] which ones do you like?
[j]: 6, [k]: 3
>>> j
next tournament [('4', '9'), ('5', '7'), ('8', '2'), ('1', '6')]
Q[3] which ones do you like?
[j]: 4, [k]: 9
>>> k
Q[4] which ones do you like?
[j]: 5, [k]: 7
>>> k
Q[5] which ones do you like?
[j]: 8, [k]: 2
>>> j
Q[6] which ones do you like?
[j]: 1, [k]: 6
>>> k
Q[7] which ones do you like?
[j]: 9, [k]: 7
>>> j
Q[8] which ones do you like?
[j]: 8, [k]: 6
>>> j
Q[9] which ones do you like?
[j]: 9, [k]: 8
>>> j
pairs [('1', '0'), ('6', '3'), ('9', '4'), ('7', '5'), ('8', '2'), ('6', '1'), ('9', '7'), ('8', '6'), ('9', '8')]
champion 9
current rank ['9']
unsort_table ['4', '7', '8']
Q[10] which ones do you like?
[j]: 4, [k]: 7
>>> k
next tournament [('8', '7')]
Q[11] which ones do you like?
[j]: 8, [k]: 7
>>> j
pairs [('1', '0'), ('6', '3'), ('7', '5'), ('8', '2'), ('6', '1'), ('8', '6'), ('7', '4'), ('8', '7')]
champion 8
current rank ['9', '8']
unsort_table ['2', '6', '7']
Q[12] which ones do you like?
[j]: 2, [k]: 6
>>> k
next tournament [('7', '6')]
Q[13] which ones do you like?
[j]: 7, [k]: 6
>>> j
pairs [('1', '0'), ('6', '3'), ('7', '5'), ('6', '1'), ('7', '4'), ('6', '2'), ('7', '6')]
champion 7
current rank ['9', '8', '7']
unsort_table ['5', '4', '6']
Q[14] which ones do you like?
[j]: 5, [k]: 4
>>> j
next tournament [('6', '5')]
Q[15] which ones do you like?
[j]: 6, [k]: 5
>>> j
pairs [('1', '0'), ('6', '3'), ('6', '1'), ('6', '2'), ('5', '4'), ('6', '5')]
champion 6
current rank ['9', '8', '7', '6']
unsort_table ['3', '1', '2', '5']
next tournament [('3', '1'), ('2', '5')]
Q[16] which ones do you like?
[j]: 3, [k]: 1
>>> j
Q[17] which ones do you like?
[j]: 2, [k]: 5
>>> k
Q[18] which ones do you like?
[j]: 3, [k]: 5
>>> k
pairs [('1', '0'), ('5', '4'), ('3', '1'), ('5', '2'), ('5', '3')]
champion 5
current rank ['9', '8', '7', '6', '5']
unsort_table ['4', '2', '3']
Q[19] which ones do you like?
[j]: 4, [k]: 2
>>> j
next tournament [('3', '4')]
Q[20] which ones do you like?
[j]: 3, [k]: 4
>>> k
pairs [('1', '0'), ('3', '1'), ('4', '2'), ('4', '3')]
champion 4
current rank ['9', '8', '7', '6', '5', '4']
unsort_table ['2', '3']
next tournament [('2', '3')]
Q[21] which ones do you like?
[j]: 2, [k]: 3
>>> k
pairs [('1', '0'), ('3', '1'), ('3', '2')]
champion 3
current rank ['9', '8', '7', '6', '5', '4', '3']
unsort_table ['1', '2']
next tournament [('1', '2')]
Q[22] which ones do you like?
[j]: 1, [k]: 2
>>> k
pairs [('1', '0'), ('2', '1')]
champion 2
current rank ['9', '8', '7', '6', '5', '4', '3', '2']
unsort_table ['1']
n = 1 :: ['9', '8', '7', '6', '5', '4', '3', '2', '1']
unsort_table ['0']
n = 1 :: ['9', '8', '7', '6', '5', '4', '3', '2', '1', '0']
unsort_table []
1: 9
2: 8
3: 7
4: 6
5: 5
6: 4
7: 3
8: 2
9: 1
10: 0
By the way, since this program itself is premised on human input, I would like the number of comparisons to be the smallest. So, I measured how many comparison operations were performed on N elements.
Prepare a test script as shown below,
test_ranking.py
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import numpy as np
import matplotlib.pyplot as plt
import commands
Ns = [10*i for i in range(1, 11)]
cal = []
for N in Ns:
raw = [i for i in range(N)]
x = np.random.choice(raw, N, replace=False)
lists = ''
for _x in x:
lists += str(int(_x)) + ' '
command = "python ranking.py -l %s -t" % lists
result = commands.getoutput(command)
cal.append(result)
f = plt.figure()
ax = f.add_subplot(111)
ax.plot(Ns, cal)
ax.set_xlabel(r"$N$")
ax.set_ylabel("calculate times")
plt.show()
The graph obtained as a result of execution is shown in the figure below. The horizontal axis represents the number of elements N (in 10 increments), and the vertical axis represents the number of comparison operations.
[Theoretical Limits of Comparative Sorting (Wikipedia)](http://ja.wikipedia.org/wiki/%E3%82%BD%E3%83%BC%E3%83%88#.E6.AF.94.E8 .BC.83.E3.82.BD.E3.83.BC.E3.83.88.E3.81.AE.E7.90.86.E8.AB.96.E9.99.90.E7.95.8C) is $ O ( For n \ log n) $, $ 100 \ log 100 \ simeq 460.5 $, and $ N = 100 $, the amount of comparison operation was 626, so it seems that the order cannot be made smaller. Conversely, I think that the algorithm created this time was a good comparison sorting algorithm.
The tournament sort itself should have been implemented somewhere, so if you define the comparison operator yourself, you might have been able to do the same thing. However, I am personally satisfied because I was able to think about it myself. I don't care if it's called reinventing the wheel. (If you are familiar with methods, please let me know)
that? What is self-analysis?
Recommended Posts