I launched multiple processes, each taking several days, on a small server with 4 CPUs and 4 GB of memory. While running at 100% CPU on all 4 cores, the machine ran out of memory within a few hours and CPU usage dropped sharply. Apparently memory had been exhausted.
As processing time grows, each process's memory usage grows → memory is exhausted → swapping starts → memory access slows down dramatically → CPU usage falls below 1% while waiting on memory → the job never finishes (severe performance degradation).
In most cases, adding servers or upgrading to a higher-spec machine solves this. I didn't go that route because it was a personal project, but I do wonder whether simply switching to a server with about 32 GB of memory would have been the real answer.
Python's memory management is fully automatic and left to the VM, so once a memory leak sets in, the only practical fix is to kill the process. I therefore made the processing idempotent and split the single command that handled all 8 categories into separate per-category runs.
```python
from enum import Enum

class Category(Enum):
    A = 1
    B = 2
    C = 3

# Before: one command processed every category in a single run
for category in Category:
    benchmark(category)
```

```python
# After: each run processes only one category, chosen by priority
category = manage.get_category_by_priority()
benchmark(category)
```
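A minimal sketch of how each per-category run could be wrapped as its own entry point, so one invocation handles exactly one category and a crashed run can simply be re-executed; `manage.get_category_by_priority()` and `benchmark()` come from the snippets in this post, while the script name and `main()` wrapper are assumptions for illustration.

```python
# run_benchmark.py -- hypothetical entry point: one invocation = one category
import manage  # assumed module exposing get_category_by_priority()

def main():
    # Pick the next unprocessed category by priority. Because the processing
    # is idempotent, re-running after a crash just redoes the same category.
    category = manage.get_category_by_priority()
    if category is None:
        return  # nothing left to process
    benchmark(category)  # benchmark() is the function shown later in this post

if __name__ == '__main__':
    main()
```

Running this script 8 times (or restarting it whenever it dies) covers all 8 categories without any single process living long enough to exhaust memory.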
The code from improvement 1 now has to be run in 8 separate steps, so whenever a process stopped it had to be started again by hand. Supervisor is convenient in such cases.
```shell
easy_install supervisor
echo_supervisord_conf > /etc/supervisord.conf
supervisord            # start the supervisor daemon
supervisorctl status   # check the status of managed processes

alias sc='supervisorctl'
sc reread              # reload the configuration
sc status
sc stop all
sc restart all
```
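Supervisor only restarts processes it manages, so the batch job needs a program section in /etc/supervisord.conf. A minimal sketch, assuming a hypothetical script path and log locations (none of these names come from the original setup):

```
[program:benchmark]
; hypothetical entry point; adjust the path to the actual script
command=python /path/to/run_benchmark.py
autostart=true
autorestart=true
stdout_logfile=/var/log/benchmark_out.log
stderr_logfile=/var/log/benchmark_err.log
```

After editing the file, `sc reread` followed by `sc update` makes supervisor pick up the new program section.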
I'm not very familiar with Python's GC, so this may have side effects, but so far the memory leak has been resolved and everything runs stably. It may well be black magic, so I can't recommend it unreservedly.
When class-level caches are used heavily, memory leaks seem to occur frequently in the Python 2 series. Would Python 3 solve this steadily growing memory consumption?
```python
from enum import Enum
import gc

class Category(Enum):
    A = 1
    B = 2
    C = 3

def benchmark(category):
    bulk = []
    tmp_data = Tmp.get_all()
    for _tmp in tmp_data:
        bulk.append(calc(_tmp))
    DBTable.bulk_create(bulk)  # Bulk!

    # Release memory explicitly
    del tmp_data
    del bulk
    gc.collect()

for category in Category:
    benchmark(category)
```
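To confirm that the explicit `del` plus `gc.collect()` is actually helping, one option is to log the process's peak resident set size at the end of each category's run; if memory really is being released, the peak stops climbing from iteration to iteration. This uses the standard `resource` module and is an addition for illustration, not part of the original code.

```python
import gc
import resource

def log_peak_memory(label):
    # ru_maxrss is the process's peak resident set size so far
    # (kilobytes on Linux, bytes on macOS)
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    print('%s: peak RSS = %d' % (label, peak))

collected = gc.collect()  # returns the number of unreachable objects it found
log_peak_memory('after gc.collect() (%d objects collected)' % collected)
```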
Reference: gc — Garbage collector interface http://docs.python.jp/2/library/gc.html