[Python] Why my Redis was so slow

Is Redis a slow KVS?

There was a time when I thought so. If you don't use Redis properly, its performance can drop to 1/10 or less of what it is capable of. In this article, I measure implementations that do and don't bring out Redis's real performance.

Conclusion

First, the conclusion:

When reading from or writing to Redis, you get its true performance by using commands such as mget/mset that read or write multiple records at once. Conversely, if you simply spin a `for` loop and issue commands such as get/set one record at a time, performance degrades badly.
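In redis-py terms, the contrast looks roughly like this (a minimal sketch of the two patterns; the actual measurements follow below):

```python
import redis

r = redis.StrictRedis()

# Slow: one network round trip per key
for i in range(10000):
    r.set(i, i)

# Fast: a single round trip carrying all 10,000 keys
r.mset({str(i): i for i in range(10000)})
```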

Environment

Measurements are done with Python 3.

Redis was run in Docker.

$ docker run --name redis -d -p 6379:6379 redis:3.2.8
$ redis-cli
127.0.0.1:6379> ping
PONG

I installed redis-cli separately, but in this article it is only used for this connectivity check.

Measurement

I write 10,000 records and measure the elapsed time. To keep the conditions identical, Redis is cleared with flushdb() before the data is written. Python's timeit module times a single run of each script (number=1).

Writing one record at a time in a loop

$ cat forloop.py 
import timeit
code = '''
import redis
r = redis.StrictRedis()
r.flushdb()  # clear the DB so every run starts from the same state
# one SET per key: 10,000 round trips to the server
for i in range(10000):
    r.set(i, i)
'''
print(timeit.timeit(code,number=1))
$ python forloop.py 
5.071391730001778

Writing 10,000 records took about 5 seconds, i.e. about 0.5 ms per write. At this point I still didn't think of it as slow...

Writing 100 records at a time, 100 times

To write the 10,000 records, I batch them as 100 records per call, 100 times. On the Redis side, set becomes mset.

$ cat chunk.py 
import timeit
code = '''
import redis
r = redis.StrictRedis()
r.flushdb()
for i in range(100):
    # build a 100-key chunk and write it in one MSET round trip
    kv = {str(j): j for j in range(i*100, (i+1)*100)}
    r.mset(kv)
'''
print(timeit.timeit(code,number=1))
$ python chunk.py 
0.2815354660015146

About 0.28 seconds to write 10,000 records, or 0.028 ms per record: **more than 10 times faster** (about 18x) than writing one by one. What a difference!

Writing all 10,000 records in one shot

While I'm at it, let's write all 10,000 records in a single call.

$ cat all.py 
import timeit
code = '''
import redis
r = redis.StrictRedis()
r.flushdb()
kv = {str(j): j for j in range(10000)}
r.mset(kv)  # all 10,000 keys in a single round trip
'''
print(timeit.timeit(code,number=1))
$ python all.py 
0.22943834099714877

It's even faster.

Measurement results and summary

Write performance to Redis differs dramatically depending on whether writes are batched. For 10,000 writes, the results were as follows.

| How written | Time for 10,000 writes (sec) |
| --- | --- |
| One at a time, 10,000 times | 5.071 |
| 100 at a time, 100 times | 0.281 |
| All 10,000 at once | 0.229 |

As the table shows, writing to Redis in bulk improves performance by **a factor of 10 or more**. Put the other way around, **writing one record at a time degrades performance to 1/10 or less**. If anything, I think **performance degradation** is the more accurate framing.
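As an aside not covered by the measurements above: when a batch can't be expressed as a single mset (for example, a mix of different commands), redis-py's pipeline gives the same round-trip saving. A minimal sketch, assuming the same local Redis:

```python
import redis

r = redis.StrictRedis()

# Queue 10,000 SETs client-side, then send them as one batch
pipe = r.pipeline()
for i in range(10000):
    pipe.set(i, i)
pipe.execute()  # one batched exchange instead of 10,000 round trips
```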

Although not measured in this article, reads show the same tendency, so reading in bulk is overwhelmingly better for performance. Especially if you're using Redis for speed in the first place, accessing it one key at a time is fatally slow.
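For reference, here is a read-side sketch in the same style as the write measurements (not part of the original measurements; it assumes the 10,000 keys written by all.py are still present):

```python
import timeit

setup = '''
import redis
r = redis.StrictRedis()
'''

# Slow: one GET round trip per key
get_loop = '''
for i in range(10000):
    r.get(str(i))
'''

# Fast: all 10,000 keys in a single MGET round trip
mget_once = '''
r.mget([str(i) for i in range(10000)])
'''

print(timeit.timeit(get_loop, setup=setup, number=1))
print(timeit.timeit(mget_once, setup=setup, number=1))
```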

Gist

The source code used for the measurements is the same as shown above; it is also available as a gist: https://gist.github.com/seiketkm/3ca7deb28e8b0579a6e443cc7f8f9f65
