[PYTHON] Solving knapsack problems with genetic algorithms and MGG models

What is a genetic algorithm?

A method of obtaining a gene (solution) suitable for the environment (problem to be solved) by repeating generations through gene crossing and mutation, like the evolution of living things.

Untitled Diagram.jpg

What is the knapsack problem?

This problem is tackled by brute force O (2 ^ n) if we simply try to find a solution. Therefore, we are trying to find an approximate solution relatively quickly using a genetic algorithm.

Flow of genetic algorithm

Problem setting

Suppose there are 30 products this time. The first element of the tuple is weight and the second element is value.

production = [(2, 21), (3, 4), (2, 28), (4, 21), (8, 7),
              (10, 22), (10, 22), (10, 22), (7, 2), (5, 40),
              (7, 28), (9, 36), (7, 14), (8, 25), (7, 14),
              (2, 21), (8, 2), (9, 36), (5, 28), (5, 4),
              (4, 12), (8, 7), (7, 28), (2, 28), (7, 28),
              (9, 24), (5, 40), (2, 21), (3, 4), (3, 40),
              (10, 15), (7, 14), (10, 18), (10, 22), (9, 33),
              (7, 2), (3, 40), (4, 12), (9, 36), (7, 35),
              (8, 25), (9, 33), (9, 24), (7, 31), (7, 21),
              (5, 28), (7, 21), (10, 15), (8, 2), (9 ,20),]

First generation generation

For genes, prepare two values, one with each product (= 1) and one without (= 0), and the number of digits for the number of products. In addition, n genes are prepared for each generation. The first generation randomly assigns values and becomes more sophisticated with each generation.

geen = [[random.randint(0, 1) for i in range(dim)] for l in range(n)]

Evaluation

The evaluation of genes varies from problem to problem, and it is important to set them well. For example, for this issue

When the weight of the product you have is severely restricted,

This time,

As a result, if all genes are above the permissible level, we will consider the products in order of lightness.

def f(g):
    weight = sum([production[j][0] * g[j] for j in range(dim)])
    if weight <= w_max:
        return sum([production[j][1] * g[j] for j in range(dim)]), weight
    else:
        return 1, weight

Choice

Select which gene to cross. This time we will use roulette selection. Random number selection is performed using the ratio of the evaluation results of each gene as the weighted probability.

def select(score):
    total = sum(score)
    threshold = random.random() * total
    sum_s = 0
    for i, s in enumerate(score):
        sum_s += s
        if sum_s > threshold:
            return i

Also, incorporate elite selection in parallel with roulette selection. The best genes can be passed on to the next generation. The total weight taken in the evaluation is taken into account here.

def find_elite(score, weight=None):
    if not weight is None and len(list(set(score))) == 1:
        min_weight = 1e+6
        min_index = None
        for i, w in enumerate(weight):
            if min_weight > w:
                min_weight = w
                min_index = i
        return min_index
    else:
        max_score = -1
        max_index = None
        for i, val in enumerate(score):
            if max_score < val:
                max_score = val
                max_index = i
        return max_index

Crossing

Combining two genes to create a new gene. This time, we will use a two-point crossover. A method of randomly specifying two points and switching between the two points. 「000|101|111」 <- 「000|010|111」 「001|101|110」

def cross(parent1, parent2):
    length = len(parent1)
    r1 = int(math.floor(random.random() * length))
    r2 = r1 + int(math.floor(random.random() * (length - r1)))

    child = copy.deepcopy(parent1)
    child[r1:r2] = parent2[r1:r2]

    return child

mutation

Over generations with only crossovers will improve the solution, but may lead to a locally optimal solution. Therefore, we introduce a mutation that rewrites the gene with a low probability.

def mutate(geen):
    for i in range(n):
        for l in range(dim):
            if random.random() > cross_rate:
                geen[i][l] = 1 - geen[i][l]

    return geen

Run

Combine the above and execute.

n = 10
w_max = 50
dim = len(production)
cross_rate = 0.99
g_num = 10000

geen = [[random.randint(0, 1) for i in range(dim)] for l in range(n)]

for stage in range(g_num):
    #Evaluation
    score = []
    weight = []
    for g in geen:
        s, w = f(g)
        score.append(s)
        weight.append(w)

    #Choice
    elite_index = find_elite(score, weight)
    if stage % 100 == 0:
        print("Generation: {}".format(stage))
        print(f(geen[elite_index]), geen[elite_index])

    #Crossing
    next_geen = [geen[elite_index]]
    while len(next_geen) < n:
        selected_index1 = select(score)
        selected_index2 = select(score)
        while selected_index1 == selected_index2:
            selected_index2 = select(score)
        next_geen.append(cross(geen[selected_index1], geen[selected_index2]))

    #mutation
    geen = mutate(next_geen)

The execution result is as follows. The view is (Evaluation result (value) of the best gene, same weight) [Gene]

Execution result


Generation: 0
(1, 102) [0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0]
Generation: 100
(243, 50) [1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1]
Generation: 200
(358, 47) [1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Generation: 300
(319, 50) [0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Generation: 400
(247, 50) [0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Generation: 500
(349, 48) [0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
Generation: 600
(394, 50) [1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0]
Generation: 700
(382, 49) [1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
Generation: 800
(328, 47) [0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
Generation: 900
(315, 48) [1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
Generation: 1000
(333, 49) [1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0]

MGG model

A derivative model of a genetic algorithm, a method of maintaining genetic diversity by localizing alternation of generations. Untitled Diagram (1).jpg

Run

n = 10
w_max = 50
dim = len(production)
cross_rate = 0.99
g_num = 10000

geen = [[random.randint(0, 1) for i in range(dim)] for l in range(n)]

for stage in range(g_num):
    #Randomly select two
    selected_index1, selected_index2 = random.sample(range(n), 2)
    parent1 = geen[selected_index1]
    parent2 = geen[selected_index2]

    #Multiple generations from the selected two by crossing / mutation
    children = [cross(parent1, parent2) for i in range(n)]
    children = mutate(children)

    #Calculate child score
    score = []
    weight = []
    for c in children:
        s, w = f(c)
        score.append(s)
        weight.append(w)

    #Child elite
    elite_index = find_elite(score, weight=weight)
    if stage % 100 == 0:
        print("Generation: {}".format(stage))
        print(f(geen[elite_index]), geen[elite_index])

    #Child roulette selection
    score[elite_index] = 0
    selected_index = select(score)

    #Return selected child to parent
    geen[selected_index1] = children[elite_index]
    geen[selected_index2] = children[selected_index]

Execution result


Generation: 0
(1, 131) [0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1]
Generation: 100
(164, 50) [1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1]
Generation: 200
(303, 48) [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
Generation: 300
(334, 47) [1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Generation: 400
(369, 50) [0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1]
Generation: 500
(369, 50) [1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1]
Generation: 600
(375, 49) [1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
Generation: 700
(366, 50) [1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
Generation: 800
(328, 48) [0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Generation: 900
(382, 49) [0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
Generation: 1000
(395, 50) [1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]

Afterword

It can be seen that the results of both are getting better with each generation. Also, the genetic algorithm seems to have waves in the results from generation to generation, but the MGG model seems to have relatively small waves.

Recommended Posts

Solving knapsack problems with genetic algorithms and MGG models
Solving 4-color problems with combinatorial optimization
Study Genetic Algorithms (1)-Basic OneMax Problems
Solving Knapsack Problems with Google's OR-Tools-Practicing the Basics of Combinatorial Optimization Problems
Getting Started with Python Genetic Algorithms
Solving nurse scheduling problems with combinatorial optimization
Solving the traveling salesman problem with the genetic algorithm (GA) and its library (vcopt)
Solving school district organization problems with combinatorial optimization
Solving the Lorenz 96 model with Julia and Python
Solving the Python knapsack problem with the greedy algorithm