Python Mathematical Optimization Package Make PuLP's Calculation Engine Faster (Windows Version Easy Edition) & Specify Calculation Time

Introduction

Among mathematical optimization problems (mathematical planning problems), as a Python package that solves linear optimization problems (linear planning problems) and integer optimization problems (integer planning problems) (* but all formulas are linear) There is ** PuLP **.

--Reference Qiita article: Mathematical optimization for free work with Python + PuLP

PuLP has a modeling interface part that allows you to describe the object you want to solve as a mathematical expression on Python, and a calculation engine part called ** mathematical optimization solver ** that solves what you describe as an expression. PuLP has a version of COIN-CBC (Official Site (GitHub)) that can be used commercially for free as a mathematical optimization solver. In addition to being done, you can also call other mathematical optimization solvers that you have obtained yourself. This means that if you prepare a mathematical optimization solver with higher performance than COIN-CBC, you can solve it in a shorter calculation time, but ** among the solvers available for free and commercial use ** COIN-CBC The only faster one is MIPCL, and as of May 2020, the availability of that MIPCL has become a mystery.

--Reference Qiita article: A free mathematical optimization solver is finally available? MIPCL

By the way, the algorithm that solves the integer optimization problem can perform parallel processing, especially for the part called the branch-and-bound method, which can considerably reduce the calculation time. Paid solvers, MIPCL, and COIN-CBC distributed on the official website work with multithreading, but ** COIN-CBC included with PuLP is multithreaded, at least for the Windows version. Doesn't work on **.

Therefore, we will introduce a method to reduce the calculation time by changing the mathematical optimization solver called from PuLP to the multi-thread compatible version of COIN-CBC. This time, as a ** bulletin version, we will deal with an easy method **. By the way, I will also touch on ** How to specify the upper limit of calculation time **.

Experiment environment

download

According to the article Mathematical optimization that can be used for free work with Python + PuLP, it is assumed that PuLP has been downloaded and installed, and a light operation check has been completed. (In the optimization calculation, unless otherwise specified, the same version of PuLP, COIN-CBC, is used).

This time, as an additional task, download Cbc-master-x86_64-w64-mingw32.zip at https://bintray.com/coin-or/download/Cbc/master#files. After downloading, it is a zip file, so unzip it.

--This binary seems to have been built from the master branch. If you search the above site or COIN-CBC official site (GitHub), you can find the binary with the version number, but that Seems to have to resolve the library dependencies themselves. --The one built from the master branch is not always a stable version with a fixed version, so I'm worried about using it ... It's more self-responsible. --Other binaries such as Cbc-master-win64-msvc16-mt.zip (compiled with Visual Studio 2019?) Did not support multithreading for some reason.

code

Assume that the COIN-CBC you downloaded and unzipped is located in C: \ Users \ (username) \ Desktop \ Cbc-master-x86_64-w64-mingw32 \ bin \ cbc.exe.

Example of this article (2) If you take the code pulp_problem_2.py and explain it,

pulp_problem_2.py


#(Omitted)
#Calculation
#Solver designation
solver = pulp.PULP_CBC_CMD()
#(Omitted)

Part of

pulp_problem_2_mt.py


#Solver designation
solver = pulp.COIN_CMD(path=r'C:\Users\(username)\Desktop\Cbc-master-x86_64-w64-mingw32\bin\cbc.exe', threads=8, maxSeconds=120.5)

If you change it to, it will work with multithreading.

--path = (COIN-CBC binary path) specifies where the COIN-CBC binary to call is located. --The string with r at the beginning is raw string, which specifies the path of Windows files and folders. It is convenient to use when. --threads = (number of threads) to specify the number of threads to use with COIN-CBC. I think the following benchmark results will be helpful as to how many are good. --Although it is out of the subject of this article, you can specify the upper limit of the calculation time with maxSeconds = (calculation time upper limit (seconds)). --The calculation may be completed within the specified time. In that case, the optimal solution is returned (if the instance has an optimal solution), and it is theoretically guaranteed through calculation that the solution is the optimal solution. -If the instance is unbounded (the objective function can be made infinitely small (large)) or infeasible (there is no solution that satisfies the constraint), it turns out that this is the case within the specified time. It will be.) --If the specified time is reached before the calculation is completed, ** the best solution found so far ** will be returned. ――The returned solution may or may not be the optimal solution. Even if it is the optimal solution, it is not theoretically guaranteed to be the optimal solution. --As one of the uses of this "return the solution found within the specified time" function, instead of building a (metaheuristic) code for the problem on your own and searching for a reasonable solution within the specified time It is possible to throw it at a mathematical optimization solver and have it return a reasonably good solution found within the same time **. --It is possible that no solution can be found within the specified time. In that case, there may be no feasible solution (solution that satisfies the constraint) in the first place, or there may be one, but no mathematical optimization solver has been found.

benchmark

Previous article, as well as graph clustering (the problem of splitting a given undirected graph into several dense subgraphs) This paper I tried to solve the formulation. The environment is Intel Core i9-9900K (** 8 cores 16 threads **). The impression of solving an instance based on the formulation of this paper is that it takes some time to find the optimal solution, but more than that, it takes a lot of time to ensure that the found solution is optimal. It can be said that it is such a type.

instance PuLP 2.1 Included COIN-CBC (1 thread) 2020/05/18th edition COIN-CBC (1 thread) 2020/05/18th edition COIN-CBC (8 thread) 2020/05/18th edition COIN-CBC (16 threads)
1 21.7 22.7 6.9 13.4
2 10.1 11.2 4.3 5.6
3 28.0 20.3 7.2 10.3
4 9.8 19.8 5.4 7.4

(Unit: seconds) (In general, the larger the instance number, the larger the instance size)

First, comparing the COIN-CBC binary that comes with PuLP 2.1 that runs on one thread and the COIN-CBC binary that you got this time, the former solves the instance in a shorter time. .. The former binary is in C: \ Users \ (user name) \ Anaconda3 \ Lib \ site-packages \ pulp \ solverdir \ cbc \ win \ 64 \ cbc.exe in this experimental environment. When I run this from the console, it says Version: 2.9.0 Build Date: Feb 12 2015. The 2020/05/18 version seems to be a little over two months from version 2.10.5. In mathematical optimization solvers, as the version goes up, there are often instances where the tuning changes in the solver take longer to solve. When the COIN-CBC binary obtained this time was operated with 8 threads, which is the number of physical cores of the CPU, the instance was solved in at least 40% of the time compared to the case of 1 thread. Since the number of cores is 1: 8, the calculation time was not reduced to 12.5%, but the result is quite good. When operating with 16 threads, which is the number of logical cores, exceeding the number of physical cores, it took longer than when operating with 8 threads. Since integer optimization is a relatively heavy process, it appears as a separate logical core due to the SMT function, but it is presumed that performance has declined due to competition for arithmetic units within the physically same core. Will be done. This phenomenon also occurs, for example, with LINPACK, which is known as a matrix operation benchmark. It seems that recent paid solvers automatically check the characteristics around the SMT of the CPU and automatically adjust the number of threads generated, but the COIN-CBC I got this time does not seem to have such a function, so I myself It seems better to explicitly set ** number of threads = number of physical cores **. Regarding which logical core the processing of each thread is assigned to, in the case of Windows, it seems that the OS side has been allocating it to different physical cores since Vista or 7.

Summary

--The mathematical optimization solver that comes with PuLP by default works only with one thread, so if you download the multi-thread operation version and specify and call as many threads as the number of physical cores of the CPU, you can solve the problem quickly. --PuLP allows you to specify an upper limit on the calculation time. The calculation may be completed within the specified time. If the specified time is reached without completing the calculation, the best solution found so far will be returned.

What I want to do from now on

--I want to compile the multithreaded operation version from source. --I want to call the DLL (compiled and generated from the sole) instead of calling cbc.exe. --In the case of the method of calling cbc.exe, PuLP outputs the instance to a file in the format called .mps → cbc.exe reads the .mps file and calculates → cbc.exe solves the solution .sol Output to a file in the format called → PuLP reads the .sol file and sets the solution, but in the case of a large-scale problem, it takes time to input and output the file. --PuLP comes with CoinMP.dll by default, but it's a 32-bit version and doesn't seem to work in a 64-bit environment? -I would like to investigate whether COIN-CBC attached to Google OR-Tools works with multithreading. -[2020/06/05 postscript] I tried 7.6.7691 of Python version of OR-Tools for Windows, but it did not work with multithreading. If you read the issues on GitHub below, Windows version ** only **? It seems that it does not support multithreading. As a workaround, it seems that you can get the multi-threaded version of CBC, edit the Makefile and build it. - https://github.com/google/or-tools/issues/1656 - https://github.com/google/or-tools/issues/1805

Recommended Posts

Python Mathematical Optimization Package Make PuLP's Calculation Engine Faster (Windows Version Easy Edition) & Specify Calculation Time
[2021 version] Python installation Windows 10 (64bit) edition