Reputation: 368
I have found two different approaches to Dijkstra's priority queue.
Shouldn't the time complexities be different?
The implementation you usually find for Dijkstra's shortest path starts filling the priority queue with ALL vertices:
(From wikipedia's pseudocode:)
for each vertex v in Graph:
    dist[v] ← INFINITY
    prev[v] ← UNDEFINED
    add v to Q
dist[source] ← 0
[...]
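For concreteness, here is a minimal Python sketch of this eager-initialization variant (my own illustration, not any particular library's code). It assumes the graph is given as a dict mapping each vertex to a dict of neighbour -> non-negative edge weight; since Python's heapq has no decrease-key, the sketch simulates it by pushing a fresh entry and skipping stale pops:

import heapq
from math import inf

def dijkstra_eager(graph, source):
    # graph: {vertex: {neighbour: weight, ...}, ...} with non-negative weights (assumed format)
    dist = {v: inf for v in graph}
    prev = {v: None for v in graph}
    dist[source] = 0

    # Fill the priority queue with ALL vertices, as in the pseudocode above.
    pq = [(dist[v], v) for v in graph]
    heapq.heapify(pq)

    settled = set()
    while pq:
        d, u = heapq.heappop(pq)
        if u in settled:                       # stale entry left behind by a simulated decrease-key
            continue
        settled.add(u)
        for v, w in graph[u].items():
            alt = d + w
            if alt < dist[v]:
                dist[v] = alt
                prev[v] = u
                heapq.heappush(pq, (alt, v))   # stands in for decrease-key

    return dist, prev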
However, as said in The Big O on the Dijkstra Fibonacci-heap solution, the complexity of Dijkstra's shortest path algorithm is:
O(|E| |decrease-key(Q)| + |V| |extract-min(Q)|)
Using a binary heap as the priority queue, that is equal to: O((|E| + |V|) log|Q|),
since decrease-key(Q) and extract-min(Q) are both O(log|Q|).
Or, if you fill the queue with all vertices: O((|E| + |V|) log|V|).
The implementation found in python's networkx package, however, doesn't fill the priority queue with all vertices. It starts only with the source vertex and continues to insert other vertices as they are discovered. Something like:
Q = priority queue
seen = provisional distances to every 'seen' vertex

add source to Q
seen[source] ← 0

while Q not empty:
    v = pop(Q)
    if v == target -> path found!
    for neig in neighbours of v:
        if neig not in seen:
            add neig to Q
        ...
        [normal Dijkstra algorithm]
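And here is a minimal runnable sketch of that lazy variant in Python (again my own illustration, not networkx's actual code), assuming the same dict-of-dicts graph format with non-negative weights:

import heapq
from math import inf

def dijkstra_lazy(graph, source, target):
    # graph: {vertex: {neighbour: weight, ...}, ...} (assumed format)
    seen = {source: 0}       # provisional distances to every 'seen' vertex
    settled = {}             # final shortest distances
    pq = [(0, source)]       # the queue starts with ONLY the source

    while pq:
        d, u = heapq.heappop(pq)
        if u in settled:     # stale entry; a shorter path to u was settled earlier
            continue
        settled[u] = d
        if u == target:      # path found!
            return d
        for neig, w in graph[u].items():
            alt = d + w
            if neig not in seen or alt < seen[neig]:
                seen[neig] = alt
                heapq.heappush(pq, (alt, neig))  # vertices enter the queue only when discovered or improved

    return inf               # target not reachable from source

As described, at any moment the queue only contains vertices that have been discovered but not yet settled (plus possibly a few stale duplicate entries).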
This way the priority queue never even comes close to having |V| elements (in a relatively sparse graph, at least). At any given point it will contain all the vertices on the boundary between explored and unseen vertices. Comparing with the queue of the "normal" implementation (Q1), this queue's (Q2) size will be: |Q2| = |Q1 where cost != inf|
Wouldn't that mean the time complexity of this implementation is lower than O((|E| + |V|) log|V|)?
Am I missing something?
I have tested it on a road network graph and indeed it takes less time to compute a shortest path this way, even excluding the time spent filling the queue initially in the "normal" case.
Upvotes: 1
Views: 351
Reputation: 35560
Yes, this algorithm is faster in terms of actual runtime, but complexity-wise it's the same. Why?

We can first look at an extreme: a graph with one node (the starting node) connected to a large number of nodes (let's say 100), where each of those is connected to one other node (the ending node). When traversing this graph, at some point you will have all 100 of those nodes in the priority queue at the same time, which is almost all the nodes in the graph.

That extreme isn't very realistic, though, so consider a more realistic case: a completely filled binary tree. When you're nearing the end, you'll have 1/4 to 1/2 of the nodes in the priority queue (since the last row contains half the elements). And even if you only had sqrt(n) nodes in the priority queue, log(sqrt(n)) is just 0.5*log(n) due to logarithmic properties. So you only get a constant-factor change, not an asymptotic one.
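Written out, the logarithmic identity this step relies on, for any fixed constant c > 0 (c = 1/2 for the sqrt(n) case):

\log\left(n^{c}\right) = c \cdot \log n = \Theta(\log n)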
Upvotes: 2
Reputation: 65498
The log function grows slowly. Even with a road network, where you might have on the order of n^(1/3) to n^(1/2) nodes on the frontier, the log factor is still at least log(n^(1/3)) = (1/3) log n = Θ(log n), so you're not seeing an asymptotic savings.
For road networks specifically, there are improvements that are way more consequential, e.g., reach and contraction hierarchies.
Upvotes: 2