[PYTHON] Comparison of Hungarian law and general-purpose solver for allocation problems

what is this

There was an article in "Assignment problem, Hungarian law and integer programming problem" that the general-purpose solver is slow. In this article, I'll show you that it's faster to modify the code a bit and "create a mathematical model and solve it with a general-purpose solver".

Python code

The code is below. You need pip install numpy pulp munkres to run it.

import random
import time

import numpy as np
from munkres import Munkres
from pulp import PULP_CBC_CMD, LpProblem, LpVariable, lpDot, lpSum, value


class AssigmentProblem:
    def __init__(self, size, seed=0):
        self.size = size
        random.seed(seed)
        rng = range(self.size)
        self.weight = np.array([[random.randint(1, 100) for i in rng] for j in rng])

    def solve_hungarian(self):
        start_tm = time.time()
        m = Munkres()
        result = m.compute(self.weight.tolist())
        val = sum([self.weight[i, j] for i, j in result])
        tm = time.time() - start_tm
        print(f"hungarian {tm = :.2f} {val = }")

    def solve_pulp(self):
        m = LpProblem("AssignmentProblem")
        rng = range(self.size)
        x = np.array(LpVariable.matrix('x', (rng, rng), cat='Binary'))
        m += lpDot(self.weight.flatten(), x.flatten())
        start_tm = time.time()
        for i in rng:
            m += lpSum(x[i]) == 1
            m += lpSum(x[:, i]) == 1
        m.solve(PULP_CBC_CMD(mip=False, msg=False))
        val = value(m.objective)
        tm = time.time() - start_tm
        print(f"pulp      {tm = :.2f} {val = }")


if __name__ == "__main__":
    p1 = AssigmentProblem(300)
    p1.solve_hungarian()
    p1.solve_pulp()

Execution result

hungarian tm = 2.43 val = 352
pulp      tm = 1.94 val = 352.0

As you can see, the general-purpose solver is faster.

Reference: Python in optimization

Main correction points

――In the original article, it took time ** to create a mathematical model **. The reason is that I used sum instead of lpSum. sum creates wasted memory and is slow. --Please also refer to "Sum of variables in mathematical model". --The variables of the mathematical model are 0-1 binary variables, but they are solved as continuous variables. Because the adjacency matrix of the assignment problem is all unimodular, linear relaxation gives an integer solution. --Reference: https://www.weblio.jp/content/ All unimodularity

Digression

――The general-purpose solver used this time is a free solver called cbc. It will be faster with a paid solver. ――In practice, an approximate solution is often sufficient. Using the approximate solution solver is a trade-off with the accuracy of the solution, but it will be even faster. ――The data this time is a complete bipartite graph, but in practice the data is often sparse. In that case, it will be even faster because there are fewer variables. ――I could solve the allocation problem with NetworkX, but it was too slow to compare.

Recommended Posts

Comparison of Hungarian law and general-purpose solver for allocation problems
Problems of liars and honesty
Problems of liars and honesty
Comparison of Apex and Lamvery
Comparison of gem, bundler and pip, venv
Comparison of solutions in weight matching problems
Comparison of class inheritance and constructor description
Comparison of L1 regularization and Leaky Relu
Speed comparison of murmurhash3, md5 and sha1