There are several ways to check whether a string starts with a given prefix in Python. This article compares the speed of three typical ones.
The measurements were made in the following execution environment.
item | value |
---|---|
Python Version | 3.8.2 |
OS | Ubuntu 20.04 |
The measurements are based on the following program. The role of each variable and function is listed below. Change the variables to match the characteristics you want to measure.
variable/function | Description |
---|---|
time_logging | Decorator for measuring time |
compare_regex | Checks each string in the argument list with a regular expression |
compare_startswith | Checks each string in the argument list with the startswith method |
compare_str | Checks whether the leading slice of each string in the argument list equals target_word |
target_word | Prefix string to look for |
match_word | String prefix that matches target_word |
not_match_word | String prefix that does not match target_word |
compare_word_num | Total number of strings to compare |
match_rate | Percentage of the strings that start with match_word |
compare_func | Function to measure |
main | Function to be called |
import functools
import re
import time


def time_logging(func):
    """Decorator that prints how long the wrapped function took."""
    @functools.wraps(func)
    def deco(*args, **kwargs):
        # perf_counter is a monotonic clock intended for measuring intervals
        stime = time.perf_counter()
        res = func(*args, **kwargs)
        etime = time.perf_counter()
        print(f'Finish {func.__name__}. Takes {round(etime - stime, 3)}s.', flush=True)
        return res
    return deco


@time_logging
def compare_regex(compare_words):
    # re.escape keeps any regex metacharacters in target_word literal
    pattern = re.compile(f'^{re.escape(target_word)}')
    for word in compare_words:
        if pattern.match(word):
            pass


@time_logging
def compare_startswith(compare_words):
    for word in compare_words:
        if word.startswith(target_word):
            pass


@time_logging
def compare_str(compare_words):
    length = len(target_word)
    for word in compare_words:
        if word[:length] == target_word:
            pass


target_word = 'foo'
match_word = target_word
not_match_word = 'bar'
compare_word_num = 100_000_000
match_rate = 50  # percentage of strings that start with match_word
compare_func = compare_regex


def main():
    compare_words = []
    for index in range(compare_word_num):
        # '<' (not '<=') so that exactly match_rate percent of the words match
        if index % 100 < match_rate:
            compare_words.append(f'{match_word}_{index}')
        else:
            compare_words.append(f'{not_match_word}_{index}')
    compare_func(compare_words)


if __name__ == '__main__':
    main()
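Before timing anything, it can be worth confirming that the three checks agree on a few inputs. The following is only a minimal sketch and not part of the measured program: the helper names by_regex, by_startswith, and by_slice and the sample words are assumptions made for this example. It also includes a prefix containing a regex metacharacter, which is the case where an unescaped pattern would give a different answer from the other two approaches.

```python
import re


def by_regex(word, prefix):
    # re.escape makes characters such as '.' match literally
    return re.match(re.escape(prefix), word) is not None


def by_startswith(word, prefix):
    return word.startswith(prefix)


def by_slice(word, prefix):
    return word[:len(prefix)] == prefix


samples = [
    ('foo_1', 'foo'),  # matching prefix
    ('bar_1', 'foo'),  # non-matching prefix
    ('fo', 'foo'),     # word shorter than the prefix
    ('a.b_1', 'a.b'),  # prefix containing a regex metacharacter
    ('axb_1', 'a.b'),  # would match an unescaped pattern, but is not a prefix match
]

for word, prefix in samples:
    results = {check(word, prefix) for check in (by_regex, by_startswith, by_slice)}
    # All three approaches should give the same answer
    assert len(results) == 1, (word, prefix)
    print(word, prefix, results.pop())
```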
Because the relative speed may depend on the length of the string being compared, the execution time of compare_regex, compare_startswith, and compare_str is measured with target_word set to 5, 10, 50, 100, and 500 characters.
Unit: seconds
function \ target_word length (characters) | 5 | 10 | 50 | 100 | 500 |
---|---|---|---|---|---|
compare_regex | 11.617 | 12.044 | 16.126 | 18.837 | 66.463 |
compare_startswith | 6.647 | 6.401 | 6.241 | 6.297 | 6.931 |
compare_str | 5.941 | 5.993 | 4.87 | 5.449 | 8.875 |
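A similar sweep over prefix lengths could be reproduced at a much smaller scale with the standard timeit module. The sketch below is an illustration under stated assumptions, not the program that produced the table: WORD_COUNT, build_words, and measure are hypothetical names, the workload is 10,000 strings instead of 100,000,000, and the non-matching words use a same-length filler prefix. Absolute numbers will therefore differ from the table.

```python
import re
import timeit

# Hypothetical, much smaller workload than the article's 100,000,000 strings
WORD_COUNT = 10_000


def build_words(prefix, count):
    # Half of the words start with the prefix, half with a same-length filler
    other = 'x' * len(prefix)
    return [f'{prefix if i % 2 == 0 else other}_{i}' for i in range(count)]


def measure(length, repeat=3):
    target = 'a' * length
    words = build_words(target, WORD_COUNT)
    pattern = re.compile('^' + re.escape(target))
    candidates = {
        'regex': lambda: [bool(pattern.match(w)) for w in words],
        'startswith': lambda: [w.startswith(target) for w in words],
        'slice': lambda: [w[:length] == target for w in words],
    }
    # Keep the best of several runs to reduce noise
    return {
        name: min(timeit.repeat(func, number=1, repeat=repeat))
        for name, func in candidates.items()
    }


if __name__ == '__main__':
    for length in (5, 10, 50, 100, 500):
        timings = measure(length)
        print(length, {name: round(t, 4) for name, t in timings.items()})
```

Taking the minimum of repeated runs is a common way to reduce the influence of other processes on a micro-benchmark; the single decorated run in the original program measures one pass only.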
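In terms of speed, prefix matching should be implemented with startswith or the slice comparison word[:length] == target_word; both were faster than the regular expression at every length measured. The slice comparison was slightly faster up to 100 characters, but startswith is the one I recommend most, because its timing was the least affected by the length of the string being compared.
I also like it the most in terms of readability.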
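One way to put a number on "least affected" is to divide each function's slowest time by its fastest time across the five lengths. The sketch below does only that arithmetic on the values from the results table; the variable names timings and spread are assumptions for this example.

```python
# Timings in seconds, copied from the results table
timings = {
    'compare_regex': [11.617, 12.044, 16.126, 18.837, 66.463],
    'compare_startswith': [6.647, 6.401, 6.241, 6.297, 6.931],
    'compare_str': [5.941, 5.993, 4.87, 5.449, 8.875],
}

# Ratio of the slowest to the fastest run for each function
spread = {name: max(values) / min(values) for name, values in timings.items()}

for name, ratio in sorted(spread.items(), key=lambda item: item[1]):
    print(f'{name}: slowest/fastest = {ratio:.2f}')
```

With these numbers the ratio is about 1.11 for compare_startswith, 1.82 for compare_str, and 5.72 for compare_regex.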