Performance improvement for Dijkstra algorithm using heaps in python?

Question

Below is my implementation for Dijkstra's algorithm using heaps (for undirected graphs).

This works just fine for reasonably sized graphs however I am not satisfied by my code for recalculating greedy criterion of nodes connected to the newly explored node.

Here I iterate over all of the items in the heap which doesn't look like a good use of the data structure. Is there a better way to achieve this ?

The graph is a dictionary which contains for each node (key) a list of nodes connected to it and the length of the edge. graph[v] returns [[w_1, l_1],...,[w_k, l_k]] where there is an edge between v and w_i of length l_i for all is between 1 and k. Sort of like an adjacency list.

shortestPath = {1: 0}  # Length of the shortest pass from s (labelled 1) to v (v's label = key)


def createHeap():
    # Heap contains the greedy criterion for an edge and the label of it's head vertex
    heap = []
    for k, v in graph.items():
        if k in shortestPath:
            continue  # Ignoring starting vertex
        bestScore = 1000000
        for (w, length) in v:
            if w in shortestPath:
                greedyScore = shortestPath[w] + length
                if greedyScore < bestScore:
                    bestScore = greedyScore
        hq.heappush(heap, (bestScore, k))  # bestScore used as heap key
    return heap


def dijkstra():
    heap = createHeap()
    for _ in range(n - 1):  # For every vertex in graph (other than starting vertex 1)
        greedyScore, w = hq.heappop(heap)  # Next vertex to be processed
        shortestPath[w] = greedyScore
        for [v, length] in graph[w]:  # Finds the new crossing edges and re-evaluates greedy criterion
            if v not in shortestPath:
                for item in heap:  # better implementation ?
                    if item[1] == v:
                        oldScore = item[0]
                        heap.remove(item)
                        hq.heapify(heap)
                        newScore = min(oldScore, greedyScore + length)
                        hq.heappush(heap, (newScore, v))

dijsktra()

trincot · Accepted Answer

There are several alternative implementations of Dijkstra's algorithm, but none of them need to iterate through the heap, nor remove arbitrary nodes from it, nor re-heapify it repeatedly. All those actions kill the performance benefit that a heap is intended to give.

I would also advise to avoid global variables. Instead pass what is needed as arguments, and let the function dijkstra return the results.

Here is one of the possible implementations you could use for your graph data structure:

def dijkstra(graph, start):
    distances = {}
    heap = [(0, start)]

    while heap:
        dist, node = hq.heappop(heap)
        if node in distances:
            continue  # Already encountered before
        # We know that this is the first time we encounter node.
        #   As we pull nodes in order of increasing distance, this 
        #   must be the node's shortest distance from the start node.
        distances[node] = dist
        for neighbor, weight in graph[node]:
            if neighbor not in distances:
                hq.heappush(heap, (dist + weight, neighbor))

    return distances

Example call:

graph = {
    "a": (("b", 8), ("c", 3), ("d", 6)),
    "b": (("a", 8), ("e", 5)),
    "c": (("a", 3), ("d", 2), ),
    "d": (("a", 6), ("c", 2), ("e", 5), ("g", 3)),
    "e": (("b", 5), ("d", 5), ("f", 5)),
    "f": (("e", 5), ("g", 3), ("h", 6)),
    "g": (("d", 3), ("f", 3), ("h", 4)),
    "h": (("g", 4), ("f", 6))
}

distances = dijkstra(graph, "a")

print(distances)

Output:

{'a': 0, 'c': 3, 'd': 5, 'b': 8, 'g': 8, 'e': 10, 'f': 11, 'h': 12}

Performance improvement for Dijkstra algorithm using heaps in python?

Answers (1)

Related Questions