Reputation: 143

Dijkstra's Algorithm: Why is it needed to find minimum-distance element in the queue

I wrote this implementation of Dijksta's Algorithm, which at each iteration of the loop while Q is not empty instead of finding the minimum element of the queue it takes the head of the queue.

Here is the code i wrote

#include <stdio.h>
#include <limits.h>

#define INF INT_MAX
int N;
int Dist[500];
int Q[500];
int Visited[500];
int Graph[500][500];

void Dijkstra(int b){
     int H = 0;
     int T = -1;
     int j,k;

Dist[b] = 0;

Q[T+1] = b;
T = T+1;

while(T>=H){
    j = Q[H];
    Visited[j] = 1;
    for (k = 0;k < N; k++){
        if(!Visited[k] && Dist[k] > Graph[j][k] + Dist[j] && Graph[j][k] != -1){
            Dist[k] = Dist[j]+Graph[j][k];
            Q[T+1] = k;
            T = T+1;
        }
    }

    H = H+1;
}
}  

int main(){

int src,target,m;
int a,w,b,i,j;

scanf("%d%d%d%d",&N,&m,&src,&target);

for(i = 0;i < N;i ++){
    for(j = 0;j < N;j++){
        Graph[i][j] = -1;
    }
}

for(i = 0; i< N; i++){
    Dist[i] = INF;
    Visited[i] = 0;
}


for(i = 0;i < m; i++){
    scanf("%d%d%d",&a,&b,&w);
    a--;
    b--;
    Graph[a][b] = w;
    Graph[b][a] = w;
}

Dijkstra(src-1);


if(Dist[target-1] == INF){
    printf("NO");
}else {
    printf("YES\n%d",Dist[target-1]);
}

return 0;
}

I ran this for all the test cases i ever found and it gave a correct answer.
My question is the why do we need to find the min at all? Can anyone explain this to me in plain english ? Also i need a test case which proves my code wrong.

Upvotes: 14

Answers (4)

melih

Reputation: 33

If you run the algorithm for the following graph it depends on the order of the children. Let's say we are looking for shortest path from 1 to 4.

If you start from the queue with 1,

dist[1] = 0
dist[2] = 21
dist[3] = 0

and seen = {1} while the queue is pushed with 2 and 3 now if we consume 2 from the queue it will make dist[4] = 51,seen={1,2}, q = [1,2,3,4] and next time when 3 is consumed from the queue 2 won't be added to queue again since it is already in seen. Hence the algorithm will later update the distance to 12+31=43 from the path of 1->3-5->4 however the shortest path is 32 and it is on 1->3->2->4.

Let me discuss some other aspects with code examples. Let's say we have a connection list of (u,v,w) where node u has a weighted and directed edge to v with weight w. And let's prepare the graph and edges as below:

graph, edges = {i: set() for i in range(1, N+1)}, dict()
for u,v,w in connection_list:
    graph[u].add(v)
    edges[(u,v)] = w

ALGORITHM1 - Pick any child to add if not visited

q = deque([start])
seen = set()
dist = {i:float('inf') for i in range(1, N+1)}
dist[start] = 0

while q:
    top = q.pop()
    seen.add(top)
    for v in graph[top]:
        dist[v] = min(dist[v], dist[top] + edges[(top, v)])
        if v not in seen:
            q.appendleft(v)

This one is already discussed above and it will give us the incorrect result 43 instead of 32 for the shortest path between 1 and 4.

The problem was not to re-add 2 to the queue, then let's get rid of seen and the children again.

ALGORITHM2 - Add all children to the queue again

while q:
    top = q.pop()
    seen.add(top)
    for v in graph[top]:
        dist[v] = min(dist[v], dist[top] + edges[(top, v)])
        q.appendleft(v)

This will work in that case, but it works only for this example though. Two issues with this algorithm,

We are adding the same nodes again so for a bigger example the complexity will depend on number of edges E instead of number of nodes V and for a dense graph we can assume O(E) = O(N^2).
If we add cycles in the graph it would run forever since there is no check to stop. So this algorithm is not a fit for cyclic graphs.

So that's why we have to spend extra time to pick the minimum child if we do it with a linear search we would end up with the same complexity as above. But if we use a priority queue we can reduce the min search to O(lgN) instead of O(N). Here is the linear search update on the code.

ALGORITHM3 - Dirty Dijkstra's Algorithm with linear minimum search

q = [K]
seen = set()
dist = {i:float('inf') for i in range(1, N+1)}
dist[start] = 0

while q:
    min_dist, top = min((dist[i], i) for i in q)
    q.remove(top)
    seen.add(top)
    for v in graph[top]:
        dist[v] = min(dist[v], dist[top] + edges[(top, v)])
        if v not in seen:
            q.append(v)

Now we know the thought process we can remember to use a heap to have the optimal Dijkstra's algorithm next time.

Upvotes: 2

Dhiraj Yadav

Reputation: 39

It is always mandatory to find out the unvisited vertex with minimum distance else you will get at least one of the edges wrong. For Example, consider the following case

4 4
1 2 8
2 4 5
1 3 2
3 2 1

             (8)   (5)
           1-----2----4
            \   /  
          (2)\ / (1) 
              3

and we start with vertex 1

distance[1]=0

when you have visited vertex 1 you have relaxed vertex 2 and vertex 3 so now

distance[2]=8 and distance[3]=2

after this, if we don't select the minimum and choose vertex 2 instead, we get

distance[4]=13

and then select vertex 3 which will give

distance[2]=3

and hence we end up with distance[4]=13 which should have been

distance[4]=8

hence we should choose minimum from unvisited at each stage of Dijkstra which can be efficiently done using priority_queue.

Upvotes: 3

Ivaylo Strandjev

Reputation: 71009

Take a look at this sample:

1-(6)-> 2 -(7)->3
  \          /
   (7)     (2)
     \    /
       4

I.e. you have edge with length 6 from 1 to 2, edge with length 7 from 2 to 3, edge with length 7 from 1 to 4 and edge from 4 to 3. I believe your algorithm will think shortest path from 1 to 3 has length 13 through 2, while actually best solution is with length 9 through 4.

Hope this make it clear.

EDIT: sorry this example did not brake the code. Have a look at this one:

Your output is Yes 8. While a path 1->7->8->3 takes only 7. Here is a link on ideone

Upvotes: 3

Marcus Johansson

Reputation: 2667

I think your code has the wrong time complexity. Your code compares (almost) all pairs of nodes, which is of quadratic time complexity.

Try adding 10000 nodes with 10000 edges and see if the code can execute within 1 seconds.