No more struggling: multi-processing using Python's standard library.

Overview

Multi-processing sounds easy enough... but then you get a little stuck along the way and end up not saving any time at all...!?

Forget it... a plain for loop will do...

This article is for people who have felt that way.

Use Python's standard ProcessPoolExecutor together with functools.partial to parallelize calculations across multiple processes comfortably. It would be a waste not to use something this handy, right?

The situation

test.py



def my_function(arg1):
    result_list = []
    for i in range(100):
        # some_operation stands in for a very heavy computation
        result_list.append(some_operation(i))
    return result_list

if __name__ == "__main__":
    my_function(arg1)  # arg1: some fixed argument your computation needs

You'd write it like this without thinking twice, right? Here the loop runs 100 times, and I want an easy way to run those iterations in parallel across multiple processes...

Easy way

  1. Rewrite the function a bit.
  2. Import ProcessPoolExecutor.
  3. Import functools.partial and map the function.

1. Rewrite the function a bit.

Rewrite the my_function from before like this.

test.py


def my_function(index, arg1):
    # process a single index instead of looping over all of them
    return some_operation(index)

In other words, instead of looping inside the function, we pass in an index and process one item at a time (see the sketch below).
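To make the relationship concrete, here is a minimal self-contained sketch of the serial equivalent; some_operation's body and the value of arg1 are placeholders for illustration:

```python
def some_operation(index):
    return index * index  # placeholder for a very heavy computation

def my_function(index, arg1):
    return some_operation(index)

arg1 = None  # placeholder for whatever fixed argument you need

# serial equivalent of the original loop: one call per index
result_list = [my_function(i, arg1) for i in range(100)]
assert result_list[3] == 9
```

Each call handles exactly one index, which is what lets executor.map farm the calls out to separate processes later.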

2. Import ProcessPoolExecutor.

First, write as follows.

test.py


import os
from concurrent.futures import ProcessPoolExecutor

max_workers = os.cpu_count() or 4
print('=====MAX WORKER========')
print(max_workers)

with ProcessPoolExecutor(max_workers=max_workers) as executor:
    ...  # the body of this with block is filled in at step 3

max_workers holds the number of CPUs available to the pool. If os.cpu_count() returns None (it can on some platforms), the `or 4` fallback kicks in, so in this example 4 would be used instead.
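As an aside, if you leave max_workers out entirely, ProcessPoolExecutor defaults to the number of processors on the machine anyway, so an even shorter sketch works:

```python
from concurrent.futures import ProcessPoolExecutor

# with max_workers omitted, the pool defaults to the machine's processor count
with ProcessPoolExecutor() as executor:
    pass  # submit or map work here
```

The explicit `os.cpu_count() or 4` version just makes the worker count visible and guards against cpu_count returning None.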

3. Import functools.partial and map the function.

Next, write like this.

test.py


import functools

with ProcessPoolExecutor(max_workers=max_workers) as executor:
    # bind arg1 by keyword so that index stays free as the first positional argument
    result_list = list(executor.map(functools.partial(my_function, arg1=arg1), range(100)))
# exiting the with block calls executor.shutdown(wait=True) automatically

That's all. To explain a little: the operations we used to run in a loop, appending to `result_list`, are now executed in multiple processes, with `executor.map` passing each index to `my_function`.

functools.partial takes a function (my_function here), fixes some of its arguments in advance, and returns a new callable that remembers them. **Caution!!** As you can see from my_function(index, arg1), only one argument is left to vary, and executor.map supplies the iterable's values to the first unbound positional argument, which is why arg1 is bound by keyword above.
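Here is a minimal self-contained sketch of that behavior; the function body and numbers are made up for illustration:

```python
import functools

def my_function(index, arg1):
    return index * arg1

# arg1 is fixed by keyword; index remains the first free positional argument
f = functools.partial(my_function, arg1=10)
print(f(3))  # 30 -- same as my_function(3, arg1=10)
print(f(7))  # 70
```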



Furthermore, executor.map calls `my_function` in parallel, once for each index in the iterator range(100), passing the index as the first argument, and collects each result into the list. The end result is exactly the same as appending to `result_list` on each pass of the loop.
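Putting the three steps together, here is a complete runnable sketch of test.py; some_operation is a placeholder for your real heavy computation, and arg1 stands for whatever fixed argument it needs:

```python
import functools
import os
from concurrent.futures import ProcessPoolExecutor


def some_operation(index):
    return index * index  # placeholder for a very heavy computation


def my_function(index, arg1):
    return some_operation(index)


if __name__ == "__main__":
    max_workers = os.cpu_count() or 4
    arg1 = None  # placeholder fixed argument
    with ProcessPoolExecutor(max_workers=max_workers) as executor:
        result_list = list(
            executor.map(functools.partial(my_function, arg1=arg1), range(100))
        )
    print(result_list[:5])  # [0, 1, 4, 9, 16]
```

executor.map returns results in input order, so result_list lines up with range(100) exactly as the loop version did.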


Summary

Multi-processing is something you reach for to save even a little time, so it's painful when writing the multi-process code itself becomes a struggle that eats a lot of time...
This time, I introduced multi-processing that you can write quite easily using only the Python standard library!

end.


