Easy parallel execution with Python subprocess

- I want to process a large amount of data in parallel.
- I want to make effective use of the CPU, since it has many cores.
- I already wrote the script, and I can't afford to rewrite it for parallel execution.

After some research, subprocess.Popen() looked like the easiest option, so I experimented with it.

Reference article: https://qiita.com/HidKamiya/items/e192a55371a2961ca8a4

Execution environment and sample code for the load test

Windows 10 (64-bit), Python 3.7.6

The CPU is a Ryzen 9 3950X (16 physical cores / 32 logical cores). (Task Manager screenshot.)

The experiment uses the code below, which computes the Fibonacci number specified by the command-line argument and prints only its last two digits.

fib_sample.py


import sys

def fib(n):
    # Iteratively compute the n-th Fibonacci number
    a, b = 0, 1
    for i in range(n):
        a, b = b, a + b
    return a

if __name__ == "__main__":
    num = int(sys.argv[1])
    result = fib(num)
    print("n = {0}, result % 100 = {1}".format(num, result % 100))

For example, running python fib_sample.py 10000 prints n = 10000, result % 100 = 75 and exits.

Sequential execution

First, try executing it sequentially with subprocess.run(). Each iteration launches Python with an argument via subprocess.run(['python', r".\fib_sample.py", str(500000 + i)]). Sweeping the command-line argument from 500000 to 500063 runs the script 64 times:

batch_sequential.py


from time import time
import subprocess

start = time()

loop_num = 64
for i in range(loop_num):
    subprocess.run(['python', r".\fib_sample.py", str(500000 + i)])

end = time()
print("%f sec" % (end - start))
> python .\batch_sequential.py
n = 500000, result % 100 = 25
n = 500001, result % 100 = 26
n = 500002, result % 100 = 51
(Omitted)
n = 500061, result % 100 = 86
n = 500062, result % 100 = 31
n = 500063, result % 100 = 17
130.562213 sec

It took a little over two minutes. You can see that the CPU's cores are barely being used. (Task Manager screenshot.)

Parallel execution

Now try executing the same processing in parallel with subprocess.Popen(). Unlike subprocess.run(), subprocess.Popen() does not wait for the spawned process to finish. The code below launches up to max_process processes, waits for all of them to finish, launches the next batch, and repeats.

batch_parallel.py


from time import time
import subprocess

start = time()

# Maximum number of processes to run in parallel
max_process = 16
proc_list = []

loop_num = 64
for i in range(loop_num):
    proc = subprocess.Popen(['python', r".\fib_sample.py", str(500000 + i)])
    proc_list.append(proc)
    if (i + 1) % max_process == 0 or (i + 1) == loop_num:
        # Wait for every process in the current batch to finish
        for subproc in proc_list:
            subproc.wait()
        proc_list = []

end = time()
print("%f sec" % (end - start))

Here is the result of running 16 processes in parallel, matching the number of physical cores of the Ryzen 9 3950X.

> python .\batch_parallel.py
n = 500002, result % 100 = 51
n = 500004, result % 100 = 28
n = 500001, result % 100 = 26
(Omitted)
n = 500049, result % 100 = 74
n = 500063, result % 100 = 17
n = 500062, result % 100 = 31
8.165289 sec

Because the processes run in parallel, they finish out of order. The total time dropped from 130.562 seconds to 8.165 seconds, almost exactly a 16x speedup.

You can see that all the cores are in use and the processes really are running in parallel. (Task Manager screenshot.)

By the way, running 32 processes in parallel (the number of logical cores) instead of 16 (the number of physical cores) is not faster; if anything, it is sometimes slower. The graph below shows the average execution time over three runs for each setting of the parallel process count. (Graph: average execution time vs. number of parallel processes.) Since I had various applications running in the background, the numbers are not very precise, but I think the trend is correct.
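To reproduce that comparison, a small timing harness along the following lines should work (this is a sketch; the batch sizes and trial count are illustrative choices, and fib_sample.py from above is assumed to be in the current directory):


from time import time
import subprocess

def run_batch(max_process, loop_num=64):
    # Launch fib_sample.py loop_num times, at most max_process at a time,
    # and return the elapsed wall-clock time in seconds.
    start = time()
    proc_list = []
    for i in range(loop_num):
        proc_list.append(
            subprocess.Popen(['python', r".\fib_sample.py", str(500000 + i)]))
        if (i + 1) % max_process == 0 or (i + 1) == loop_num:
            for subproc in proc_list:
                subproc.wait()
            proc_list = []
    return time() - start

if __name__ == "__main__":
    trials = 3
    # Try batch sizes around the physical (16) and logical (32) core counts
    for max_process in (8, 16, 24, 32):
        elapsed = [run_batch(max_process) for _ in range(trials)]
        print("max_process = %2d: average %.3f sec over %d runs"
              % (max_process, sum(elapsed) / trials, trials))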

Summary

Speeding things up turned out to be quite easy. One caveat: in the code above, each batch waits for its slowest member, so if one process happens to take much longer than the rest, the time spent waiting for it becomes overhead. Ideally the code should keep the number of running processes constant by launching the next process as soon as any one finishes. In my case, however, the real goal was to apply the same signal processing to a large number of data files of the same length, so I expected the per-process execution times to be roughly equal and turned a blind eye to this. For now, I'm satisfied that the goal of an easy speedup was achieved.
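A minimal sketch of that idea (illustrative, not the code measured above): poll the running Popen objects and launch a replacement as soon as a slot frees up.


from time import time, sleep
import subprocess

start = time()

max_process = 16
loop_num = 64
running = []

for i in range(loop_num):
    # Wait until fewer than max_process processes are still running;
    # poll() returns None while a process has not yet finished.
    while len(running) >= max_process:
        running = [p for p in running if p.poll() is None]
        sleep(0.01)
    running.append(
        subprocess.Popen(['python', r".\fib_sample.py", str(500000 + i)]))

# Wait for the final processes to finish
for p in running:
    p.wait()

end = time()
print("%f sec" % (end - start))


With this structure a new process starts the moment any slot opens, so the parallelism stays at max_process until the work runs out.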
