[PYTHON] Dijkstra algorithm for beginners

Isn't there a Dijkstra algorithm? It's hard to remember the name, and it's unlikely that you'll implement it in practice, such as the shortest path problem of weighted graphs, so you'll forget it soon. .. ..

What is Dijkstra's algorithm?

Dijkstra's algorithm is an algorithm that finds the shortest path of a graph. I think many people have heard only the name. It's easy and simple to do, but it's a little difficult to understand, but it's a relatively easy-to-understand algorithm if you understand each one. Breadth-first search can be used to solve unweighted, maze searches, etc., but if each side is weighted, you will have to calculate the entire route after all. Assuming that all vertices are passed only once, if there are E sides, it will be O (E!) And the amount of calculation will explode. It's a little tough to calculate this.

The Dijkstra algorithm is an algorithm that efficiently solves such problems.

By the way, the cost of each side must be a non-negative value (0 or more). If a negative number is included, the Bellman-Ford method etc. will be used.

procedure

The Dijkstra procedure is fairly simple.

  1. Set the shortest distance 0 as the starting point
  2. Move to the shortest distance known and the shortest distance from the points that have not been traced yet
  3. Set the shortest distance between the vertices connected to that vertex. At this time, if the shortest distance of the vertex can be updated, update it.
  4. Do this until you know the shortest distance of all vertices

Well, I think it's a little difficult to say, so let's see an example.

Example

It's an analog method, but I was able to express it the most. .. .. Consider the shortest path in the figure below. image.png

Green is the top of the movement with the shortest path determined. Red is the starting point. The number at each vertex is the shortest path at that time dijkstra algo.gif

As you can see, you can see that the shortest path of the adjacent vertices is calculated from the starting point by moving to the shortest vertex at that time in order.

Implementation

Now let's get into the implementation. Let's implement the above procedure in a straightforward manner.

function main(nodes) {
    const start = nodes[0]
    //Record visited vertices
    const visited = new Set()
    const routesFromStart = new Map()
     //Record the distance from the starting point

    routesFromStart.set(start, {distance: 0})
    for(const n of nodes) {
        if(n != start) {
            //Substitute infinity for all vertices except the start
            routesFromStart.set(n, {distance: Number.MAX_VALUE})
        }
    }
    let current = start
    let routes = new Map()
    while(current != null) {
        visited.add(current)
        for(const edge of current.edges) {
             //Calculates the shortest distance from that vertex to adjacent vertices and updates if it is lower than the calculated value
            if(edge.cost + routesFromStart.get(current).distance < routesFromStart.get(edge.to).distance) {
                routesFromStart.set(edge.to, {distance: edge.cost + routesFromStart.get(current).distance})
                routes.set(current, edge.to)
            }
        }
        let cheapestNodeDistance = Number.MAX_VALUE
        current = null
        //Select the smallest vertex from the calculated vertices for the shortest unvisited distance

        for(const city of routesFromStart.keys()) {
            if(!visited.has(city) && cheapestNodeDistance > routesFromStart.get(city).distance){
                cheapestNodeDistance = routesFromStart.get(city).distance
                current = city
            }
        }
    }
    return routesFromStart.get(nodes[nodes.length - 1]).distance
}

Assuming that each vertex is V, this code goes around the loop for selecting the smallest vertex in the loop that goes around the number of each side at the maximum, so the amount of calculation is O (V ^ 2 + E) for each memory. It is O (V) because it is necessary to record the vertices.

Implementation using Priority Queue

Well, as you may have noticed, this code actually allows you to optimize the logic that determines the smallest vertices. That is the priority queue. Priority queues require O (logN) complexity to insert and retrieve, but the complexity of determining the smallest vertices is faster than a linear search.

Priority Queue is not implemented as standard in JavaScript, so it is implemented in Python.


def dijkstra(nodes):
    start_node = nodes[0]
    routes_from_start = {n: math.inf for n in nodes}

    #Set zero to first vertex
    routes_from_start[start_node] = 0
    
    minHeap = []

    #Add first vertex
    heappush(minHeap, (0, start_node))

    #Search until heap runs out
    while minHeap:
        (cost, current_node) = heappop(minHeap)

        #Since the priority key is duplicated, check here
        if cost > routes_from_start[current_node]:
            continue

        for node in current_node.routes:
            price_info = current_node.routes[node]
            if routes_from_start[node] > price_info + routes_from_start[current_node]:
                routes_from_start[node] = price_info + routes_from_start[current_node]
                #Add value to priority when updated
                heappush(minHeap, (price_info + routes_from_start[current_node], node))

    return routes_from_start[nodes[-1]]

Let's take another opportunity to explain Priority Queue. If the calculation efficiency is better and each vertex is V and each side is V, heap is operated for the number of times of O (V) and E that sets V to map, so O (ElogE). You can see that it can be calculated by the total O (V + ElogE) of. This is more efficient than the first algorithm.

Memorize the route

Now I know the shortest cost. But this problem is the "shortest path" problem. When you find the shortest cost, you usually want to know the route. Let's improve the above code.

def dijkstra(nodes):
    start_node = nodes[0]
    routes_from_start = {n: math.inf for n in nodes}

    #Set zero to first vertex
    routes_from_start[start_node] = 0
    
    minHeap = []


    #Add first vertex
    heappush(minHeap, (0, start_node))
    path = collections.defaultdict(Node)

    #Search until heap runs out
    while minHeap:
        (cost, current_node) = heappop(minHeap)

        #Since the priority key is duplicated, check here
        if cost > routes_from_start[current_node]:
            continue

        for node in current_node.routes:
            price_info = current_node.routes[node]
            if routes_from_start[node] > price_info + routes_from_start[current_node]:
                routes_from_start[node] = price_info + routes_from_start[current_node]
                #Record the node that updates the shortest distance
                path[node.id] = current_node.id
                #Add value to priority when updated
                heappush(minHeap, (price_info + routes_from_start[current_node], node))

    current_node = nodes[-1].id
    path_array = []

    #Follow the node that recorded the shortest distance from the goal
    while current_node:
        path_array.append(current_node)
        if current_node not in path:
            break
        current_node = path[current_node]

    return routes_from_start[nodes[-1]], path_array[::-1]

The Dijkstra algorithm knows which node updates the shortest distance, so you can record it and follow it last. The amount of calculation will increase by the number of nodes with the shortest distance.

By the way, why is this the shortest path?

By the way, I think many people have thought this way after looking at it so far. Sure, the algorithm is simple and it's not too difficult to implement. But why can you find the shortest distance? Let's check it lightly

image.png

Assuming that the vertex in L is the shortest distance from the start S, it would be nice to say that the shortest vertex connected from there is also the shortest distance from S.

image.png

image.png

Now, we will move to the shortest vertex included in T, so if the smallest point is i, then d [i] = min (T), isn't it? Now, assuming that each vertex is k, it is certain that the shortest distance d [k] is d [k]> = d [i]. Because d [i] is the smallest point and each vertex is non-negative. You can prove it inductively by doing it one after another.

Well, if you think about it, it's a recurrence formula.

The shortest distance between the vertices of L adjacent to d [i] = min (k ⊂ T) + i

Dynamic programming is used for recurrence formulas

The recurrence formula is DP, isn't it? This article is very helpful for DP (https://qiita.com/drken/items/a5e6fe22863b7992efdb)

Then, how will the value be updated in DP? image.png

The value will be updated like this. The vertical axis is the number of trials. The horizontal axis is the apex.

What the Dijkstra algorithm was a kind of DP.

Summary

The Dijkstra algorithm I've tried, but once you understand it, it's fairly easy to understand. After that, when I tried to implement it and encountered a similar problem due to an algorithm problem, this was that time! !! I want to solve it like that.

Click here for the explained youtube channel https://youtu.be/jz8b0q5R1Ss

reference

http://www.lab2.kuis.kyoto-u.ac.jp/~shuichi/algintro/alg-6s.pdf https://www.youtube.com/watch?v=X1AsMlJdiok

Recommended Posts

Dijkstra algorithm for beginners
Roadmap for beginners
Spacemacs settings (for beginners)
python textbook for beginners
OpenCV for Python beginners
[For beginners] kaggle exercise (merucari)
Linux distribution recommended for beginners
CNN (1) for image classification (for beginners)
Python3 environment construction (for beginners)
Overview of Docker (for beginners)
Seaborn basics for beginners ④ pairplot
Basic Python grammar for beginners
100 Pandas knocks for Python beginners
Python for super beginners Python #functions 1
Python #list for super beginners
~ Tips for beginners to Python ③ ~
[For Kaggle beginners] Titanic (LightGBM)
Reference resource summary (for beginners)
Linux command memorandum [for beginners]
Convenient Linux shortcuts (for beginners)
[For beginners] Re: Genetic algorithm starting from zero [Artificial intelligence]
[Explanation for beginners] TensorFlow tutorial MNIST (for beginners)
TensorFlow MNIST For ML Beginners Translation
Decision tree (for beginners) -Code edition-
Pandas basics for beginners ⑧ Digit processing
Python Exercise for Beginners # 2 [for Statement / While Statement]
Seaborn basics for beginners ② Histogram (distplot)
[For beginners] Django -Development environment construction-
[For beginners] Script within 10 lines (1.folium)
Python #index for super beginners, slices
<For beginners> python library <For machine learning>
TensorFlow Tutorial MNIST For ML Beginners
Frequently used Linux commands (for beginners)
[Must-see for beginners] Basics of Linux
Python #len function for super beginners
Beginners use Python for web scraping (1)
Run unittests in Python (for beginners)
What is xg boost (1) (for beginners)
Algorithm for finding collusion spam reviewers
Beginners use Python for web scraping (4) ―― 1
Python #Hello World for super beginners
Linear regression (for beginners) -Code edition-
Python for super beginners Python # dictionary type 2 for super beginners
Pandas basics summary link for beginners
[For beginners] Process monitoring using cron
LSTM (1) for time series forecasting (for beginners)
[Deprecated] Chainer v1.24.0 Tutorial for beginners
TensorFlow Tutorial -MNIST For ML Beginners
Ridge Regression (for beginners) -Code Edition-
[Explanation for beginners] TensorFlow tutorial Deep MNIST
INSERT into MySQL with Python [For beginners]
[Kaggle for super beginners] Titanic (Logistic regression)
[Python] Minutes of study meeting for beginners (7/15)
First Steps for Machine Learning (AI) Beginners
[Linux command summary] Command list [Must-see for beginners]
Let's put together Python for super beginners
Django tutorial summary for beginners by beginners ③ (View)
Implementation and explanation using XGBoost for beginners
Linux operation for beginners Basic command summary
Time series data anomaly detection for beginners
Supplementary notes for TensorFlow MNIST For ML Beginners