I tried "Implementing a genetic algorithm (GA) in python to solve the traveling salesman problem (TSP)"

Introduction

Since this is Qiita's first post, please give us all the details such as not only the content but also the format and writing style.

[Implementing the genetic algorithm (GA) in python to solve the traveling salesman problem (TSP)](http://samuiui.com/2019/10/27/ Implementing the genetic algorithm (ga) in python Patrol /)

I actually tried. I thought that if I got used to coding, I wouldn't have to mention it one by one. I will record what I am interested in as much as possible.

[Here]((http://samuiui.com/2019/10/27/python implements genetic algorithm (ga) and patrols /)) is used as a reference (& quote).

Suddenly

Even if you copy and paste the code of [link above]((http://samuiui.com/2019/10/27/python implements genetic algorithm (ga) and patrols /)) Sometimes it didn't work. That is natural, for example

Things like this
import numpy as np
import random
import matplotlib.pyplot as plt
I have to write. Of course, I shouldn't mention such a rudimentary ... There was a misspelling ... maybe I'm doing it too ...

Let's get into the main subject.

What is a genetic algorithm (GA)?

Prosperous offspring of those with excellent genes. In the genetic algorithm, a good solution is prospered in the same way.

Descendants prosperity

A dominant individual for every element gives birth to many offspring, Inferior individuals can give birth to only a few offspring. The same thing happens with the movement from child generation to grandchild generation. By repeating this process, sophisticated children who are adapted to the environment are born.

mutation

As the crossing continues, only similar individuals will be found. Our genes ensure diversity by causing occasional mutations.

What is the Traveling Salesman Problem (TSP)?

The traveling salesman problem is It's a kind of problem that a salesman should go around n business destinations in what order. Theoretically, there are n! Street routes, so brute force is difficult.

This time I would like to solve GA using this TSP.

flow

[Source]((http://samuiui.com/2019/10/27/Implementing a genetic algorithm (ga) in python and patrols /)) flow_ga.jpg

Initial individual generation

As an early individual, you need to create a city and a gene.

city

Creates n cities with both x and y coordinates in the range 0-1. Input is the number of cities, output is the matrix of (number of cities) x 2 Implementation example: generate_rand_cities

#Generate a city
def generate_rand_cities(num_cities):
    positions = np.zeros((num_cities, 2))
    for i in range(num_cities):
        positions[i, 0] = random.random()
        positions[i, 1] = random.random()
    return positions
#As a test
generate_rand_cities(10)

About random (random.random)

random.random()

Randomly generates floating point numbers between 0.0 and 1.0

gene

Give the gene information about the order in which it travels through cities. The length of the gene matches the number of cities. For example, when the number of cities is 6, [3, 5, 1, 0, 4, 2] [0, 4, 2, 1, 5, 3] Such individuals are possible.

Because genes make multiple individuals in one generation Genes are represented by the matrix (number of genes) x (number of cities).

The input is the number of cities and the number of individuals in one generation, and the output is the matrix of (number of genes) x (number of cities).

Implementation example: generate_init_genes

#Gene generation
def generate_init_genes(num_individual, num_cities):
    genes = np.zeros((num_individual, num_cities), dtype=np.int16)
    for i in range(num_individual):
        genes[i,] = random.sample(range(num_cities), k=num_cities)
    return genes
#As a test
generate_init_genes(15,10)

About random (random.sample)

Random.sample is taken out randomly without duplication. For example

l = [0, 1, 2, 3, 4]
random.sample(l, 3)

Will output something like [4, 2, 1] or [1, 0, 3].

Evaluation

In this task, genes showing shorter pathways need to be highly evaluated. Therefore, first, the length of the pathway corresponding to the gene is obtained. After that, it is compared and evaluated within the generation.

Pathways corresponding to genes

The reference source calculated the distance between cities for each gene. Make a table about the distance between cities and refer to it according to the order of genes.

Distance table between cities

Implementation example

#Table about distances between cities
def generate_dist_table(positions):
    path_table = np.zeros((len(positions), len(positions)))
    for i in range(len(positions)):
        for j in range(len(positions)):
            path_table[i,j] = np.linalg.norm(positions[i]-positions[j])
    return path_table

#As a test
num_cities = 4
positions = generate_rand_cities(4)
generate_dist_table(positions)

The distance between two cities is calculated twice. I can still scrape here ...

The length of the pathway corresponding to the gene

Based on this table, the length of the pathway corresponding to the gene is obtained.


#Refer to the table to find the sum of the pathways corresponding to the genes.
def sum_path2(path_table, gene):
    sum = 0.
    for i in range(len(gene)-1):
        sum += path_table[int(gene[i]),int(gene[i+1])]
    return sum

#As a test
cities = generate_rand_cities(10)
genes = generate_init_genes(15,10)
dist_table = generate_dist_table(cities)
for i in range(len(genes)):
    print(sum_path2(dist_table, genes[i]))

In order to evaluate genes collectively for each generation, set a function that outputs genes for one generation collectively.

#One-generation calculation for the sum of routes
def genes_path2(path_table, genes):
    pathlength_vec = np.zeros(len(genes))
    for i in range(len(genes)):
        pathlength_vec[i] = sum_path2(path_table, genes[i])
    return pathlength_vec

#As a test
cities = generate_rand_cities(10)
genes = generate_init_genes(15,10)
dist_table = generate_dist_table(cities)
genes_path2(dist_table, genes)

roulette

The length of the pathway corresponding to the gene was found. This time, the shorter the route, the higher the evaluation. The shorter the route, the higher the probability of being selected as the one that leaves offspring. Perform normalization.

#roulette
def generate_roulette(fitness_vec):
    total = np.sum(fitness_vec)
    roulette = np.zeros(len(fitness_vec))
    for i in range(len(fitness_vec)):
        roulette[i] = fitness_vec[i]/total
    return roulette

#As a test
fitness = np.array([20,50,30])
generate_roulette(fitness)

On the contrary

array([0.2, 0.5, 0.3])

Will be.

Evaluate genes

The length of the pathway corresponding to the gene is already known. This time, the shorter the route, the higher the evaluation. The shorter the route, the higher the probability of being selected as the one that leaves offspring. Therefore, this time, the reciprocal of the path length is used.

#Roulette execution example
cities = generate_rand_cities(20)
genes = generate_init_genes(21, 20)
dist_table = generate_dist_table(cities)
paths = genes_path2(dist_table, genes)
inverse_path = 1/paths
print("path length: "+str(paths))
print("roulette table: "+str(generate_roulette(inverse_path)))

Since the total of the roulette table is 1, it can be interpreted as the probability that the corresponding gene will be selected. This is used to select genes.

Selection (selection)

#Select individuals to cross based on roulette
def roulette_choice(fitness_vec):
    roulette = generate_roulette(fitness_vec)
    choiced = np.random.choice(len(roulette), 2, replace=True, p=roulette)
    return choiced

#As a test
cities = generate_rand_cities(20)
genes = generate_init_genes(21, 20)
dist_table = generate_dist_table(cities)
fitness_vec = 1 / genes_path2(dist_table, genes)
roulette_choice(fitness_vec)

Create a roulette table with generate_roulette. Select two genes from the roulette based on the roulette table.

About random (random.choice)

random.choice has duplicates and randomly selects a value. The range of numbers selected by the first argument, The second argument is the number to choose, Specify whether there is a duplicate with replace (True with duplicate) The probability that each is selected is specified by p.

Crossing

It is now possible to select two relatively excellent individuals using roulette_choice. Multiply the two individuals. There are multiple crossover methods depending on the shape of the gene and also for the same gene shape. For example, there seems to be order crossing and circular crossing → Application of GA to order problems This time we performed a partial crossover. Please refer to the image below for the method of partial crossing. [Source]((http://samuiui.com/2019/10/27/Implementing a genetic algorithm (ga) in python and patrols /)) image.png image.png image.png image.png

Implementation of partial crossover

#Implementation of partial crossover
def partial_crossover(parent1, parent2):
    num = len(parent1)
    cross_point = random.randrange(1,num-1)   #Break
    child1 = parent1
    child2 = parent2
    for i in range(num - cross_point):
        #Value just to the right of the break
        target_index = cross_point + i
        target_value1 = parent1[target_index]
        target_value2 = parent2[target_index]
        exchange_index1 = np.where(parent1 == target_value2)
        exchange_index2 = np.where(parent2 == target_value1)
        #Exchange
        child1[target_index] = target_value2
        child2[target_index] = target_value1
        child1[exchange_index1] = target_value1
        child2[exchange_index2] = target_value2
    return child1, child2

#As a test
genes = generate_init_genes(2, 10)
print("parent1: "+str(genes[0]))
print("parent2: "+str(genes[1]))
child = partial_crossover(genes[0], genes[1])
print("child1:  "+str(child[0]))
print("child2:  "+str(child[1]))

mutation

There are various methods for mutation. This time we will do what is called a translocation. Swap two randomly selected places.

def translocation_mutation2(genes, p_value):
    mutated_genes = genes
    for i in range(len(genes)):
        mutation_flg = np.random.choice(2, 1, p = [1-p_value, p_value])
        if mutation_flg == 1:
            mutation_value = np.random.choice(genes[i], 2, replace = False)
            mutation_position1 = np.where(genes[i] == mutation_value[0])
            mutation_position2 = np.where(genes[i] == mutation_value[1])
            mutated_genes[i][mutation_position1] = mutation_value[1]
            mutated_genes[i][mutation_position2] = mutation_value[0]
    return mutated_genes

#As a test
genes = generate_init_genes(5, 10)
print("before")
print(genes)
print("after")
print(translocation_mutation2(genes, 0.7))

mutation_flag has a probability of p_value of 1. Therefore, the gene mutates with a probability of p_value.

Visualization

Although it is not written in the flowchart, it is visualized using matplotlib to check the transition of GA.

City location

#Visualization of city location
def show_cities(cities):
    for i in range(len(cities)):
        plt.scatter(cities[i][0], cities[i][1])

#As a test
cities = generate_rand_cities(10)
show_cities(cities)

Route visualization

def show_route(cities, gene):
    for i in range(len(gene)-1):
        if i == 0:
            plt.text(cities[int(gene[i])][0], cities[int(gene[i])][1], "start")
        else:
            plt.text(cities[int(gene[i])][0], cities[int(gene[i])][1], str(i))
        plt.plot([cities[int(gene[i])][0], cities[int(gene[i+1])][0]], 
                 [cities[int(gene[i])][1], cities[int(gene[i+1])][1]])
    plt.text(cities[int(gene[i+1])][0], cities[int(gene[i+1])][1], "goal")

#As a test
cities = generate_rand_cities(10)
show_cities(cities)
genes = generate_init_genes(10,10)
show_route(cities,genes[0])

Integration

Implement a program that solves TSP with GA by using the functions created so far.

Parameters

#Parameters
num_cities = 10
individuals = 21
generation = 10000
elite = 9
p_mutation = 0.01

Initialize

#Initialize
cities = generate_rand_cities(num_cities)
genes = generate_init_genes(individuals, num_cities)
dist_table = generate_dist_table(cities)
show_cities(cities)
show_route(cities, genes[0])
plt.show()

You should be able to see this download.png

Execution department

With excellent elite of the parent generation Individual-elite children born by crossing between parent generations are responsible for the child generation. Mutations occur in the offspring to form the final offspring.

#Execution department
top_individual=[]  #List of best fitness for each generation
max_fit = 0  #Highest fitness ever
fitness_vec = np.reciprocal(genes_path2(dist_table, genes))
for i in range(generation):
    children = np.zeros(np.shape(genes))
    for j in range(int((individuals-elite)/2+1)):
        parents_indices = roulette_choice(fitness_vec)
        children[2*j], children[2*j+1] = partial_crossover(genes[parents_indices[0]], genes[parents_indices[1]])

    for j in range(individuals-elite, individuals):
        children[j] = genes[np.argsort(genes_path2(dist_table, genes))[j-individuals+elite]]

    children = translocation_mutation2(children, p_mutation)
    top_individual.append(max(fitness_vec))
    genes = children
    fitness_vec = np.reciprocal(genes_path2(dist_table, genes))
    if max(fitness_vec) > max_fit:  #After recording the highest fitness ever, display the route
        max_fit = max(fitness_vec)
        show_cities(cities)
        show_route(cities, genes[np.argmax(fitness_vec)])
        plt.text(0.05, 0.0, "generation: " + str(i) + "  distance: " +str(sum_path2(dist_table,genes[np.argmax(fitness_vec)])))
        plt.show()

top_individual is a list of the best fitness of each generation.

The last one displayed download.png It will be the one that goes around in the order that the route becomes shorter like. Also,

plt.plot(top_individual)

You can check the transition of fitness as the generation changes.

About numpy (memo)

np.reciprocal () → Gives the reciprocal of each element np.argmax () → Gives the index of the largest of each element np.argsort () → Index order when each element is sorted

Summary

As a default setting Number of cities, Gene population, Number of evolutionary generations, Number of elite, Mutation probability, Was given.

Obviously there is a better route if you increase the number of cities, but it doesn't evolve into that route. Also, it takes time to calculate There is still room for improvement. As a possible point

How to cross

I don't know anything because I don't know other ways of crossing. There may be a better crossing method. Also, I want to connect it with mutation.

Mutation probability

It is thought that efficiency will be improved if the probability of mutation changes depending on the similarity of parents.

Number of elites picked up

This time I decided on the initial settings. However, as with the probability of mutation, it may be better to change this according to the similarity of the parents. As a test, if the number of elites is reduced, the fitness does not increase and it vibrates. It is thought that fitness will not increase in the early stages because bad elements will be picked up by crossing.

Number of generations

The number of generations was also given by default, but there is room for improvement. The number of generations is directly related to the burden of calculation. If the end condition is set from the rate of change in fitness, it will not be necessary to specify the number of generations.

Gene population

Similarly, the number of gene individuals is involved in the calculation. If the number of individuals is small, it will be necessary to increase the generation of evolution. If you set it in a suitable ratio for the number of cities given, you will not need to enter it.

Outlook

In the end, just enter the city and it will guide you to the right route. Therefore, it seems necessary to investigate the crossover method, how to give the probability of mutation, and suitable values for the number of individuals and generations of genes. I will do my best.

random summary

-Randomly select one element: random.choice () -Randomly select multiple elements (no duplication): random.sample () -Randomly select multiple elements (with duplicates): random.choices () -Randomly generate floating point numbers from 0 to 1: random.random () From Using random with python

Recommended Posts

I tried "Implementing a genetic algorithm (GA) in python to solve the traveling salesman problem (TSP)"
Try to solve the traveling salesman problem with a genetic algorithm (Python code)
Python: I tried the traveling salesman problem
I tried to implement the traveling salesman problem
I tried to solve the problem with Python Vol.1
I wanted to solve the ABC164 A ~ D problem with Python
The story of the algorithm drawing a ridiculous conclusion when trying to solve the traveling salesman problem properly
Solving the traveling salesman problem with the genetic algorithm (GA) and its library (vcopt)
Solve the Python asymmetric traveling salesman problem with a branch and bound method
I tried to graph the packages installed in Python
I tried to solve the soma cube with python
The 15th offline real-time I tried to solve the problem of how to write with python
I tried the super-resolution algorithm "PULSE" in a Windows environment
I tried to implement a one-dimensional cellular automaton in Python
Finding a solution to the N-Queen problem with a genetic algorithm (2)
I tried to solve a combination optimization problem with Qiskit
I tried "How to get a method decorated in Python"
I tried to implement the mail sending function in Python
I tried to make a stopwatch using tkinter in python
Finding a solution to the N-Queen problem with a genetic algorithm (1)
Implementing a simple algorithm in Python 2
How to write offline real time I tried to solve the problem of F02 with Python
I tried to create a Python script to get the value of a cell in Microsoft Excel
I also tried to imitate the function monad and State monad with a generator in Python
I wrote a doctest in "I tried to simulate the probability of a bingo game with Python"
I tried to solve the ant book beginner's edition with python
I tried to solve the shift scheduling problem by various methods
Solve the subset sum problem with a full search in Python
I tried to find out the difference between A + = B and A = A + B in Python, so make a note
A story that didn't work when I tried to log in with the Python requests module
I tried to implement PLSA in Python
I tried paiza's campaign problem. (Traveling salesman problem)
I tried to implement permutation in Python
I tried to implement PLSA in Python 2
I tried to implement ADALINE in Python
I wanted to solve ABC159 in Python
I tried to implement PPO in Python
Solve the traveling salesman problem with OR-Tools
Solve the maximum subarray problem in Python
I tried to solve TSP with QAOA
[Python] I tried to summarize the set type (set) in an easy-to-understand manner.
I tried to develop a Formatter that outputs Python logs in JSON
I tried to display the altitude value of DTM in a graph
I tried to implement a card game of playing cards in Python
Try to calculate a statistical problem in Python
Try to solve the Python class inheritance problem
I want to create a window in Python
I tried playing a typing game in Python
I tried simulating the "birthday paradox" in Python
I tried the least squares method in Python
I tried to implement TOPIC MODEL in Python
I tried adding a Python3 module in C
I want to display the progress in Python!
A solution to the problem that the Python version in Conda cannot be changed
I tried to create a class that can easily serialize Json in Python
I searched for the skills needed to become a web engineer in Python
[Python] I tried to get the type name as a string from the type function
I tried to implement what seems to be a Windows snipping tool in Python
[Python & SQLite] I tried to analyze the expected value of a race with horses in the 1x win range ①
Introduction to AI creation with Python! Part 2 I tried to predict the house price in Boston with a neural network
[Keras] I tried to solve a donut-type region classification problem by machine learning [Study]