jason-lin

Reputation: 17

Unpredictable error in an MPI program: the main process runs twice

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int rank, value, size,count;
    MPI_Status status;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank); 
    MPI_Comm_size(MPI_COMM_WORLD, &size); 
    count=2*size-1;
    while(count>0){
        if (rank==0) {
            // fprintf(stderr, "\nPlease give new value=");
            printf("please input value= ");
            scanf("%d",&value);
            // fprintf(stderr, "%d read <-<- (%d)\n",rank,value);
            printf("%d read <-<- (%d)\n",rank,value);
            count-=1;
    
            if (size>1) {
                MPI_Send(&value, 1, MPI_INT, rank+1, 0, MPI_COMM_WORLD);
                // fprintf(stderr, "%d send (%d)->-> %d\n", rank,value,rank+1);
                printf("%d send (%d)->-> %d\n",rank,value,rank+1);
                count-=1;
            }
        }
        else {
            MPI_Recv(&value, 1, MPI_INT, rank-1, 0, MPI_COMM_WORLD, &status);
            // fprintf(stderr, "%d receive(%d)<-<- %d\n",rank, value, rank-1);
            printf("%d receive(%d)<-<- %d\n",rank, value, rank-1);
            count-=1;
            if (rank<size-1) {
                MPI_Send(&value, 1, MPI_INT, rank+1, 0, MPI_COMM_WORLD);
                fprintf(stderr, "%d send (%d)->-> %d\n", rank, value, rank+1);
                count-=1;
            }
        }
        MPI_Barrier(MPI_COMM_WORLD);
    }
    MPI_Finalize();
}

This program passes a number from one process to the next. I ran it with two processes and entered the number 4:


But process 0 ran twice, which is not what I expected.

Then I debugged it with gdb.


This has been bothering me for a long time, and I'm not very good at inspecting variables from the command line. Please help me.

Upvotes: 1

Views: 177

Answers (1)

dreamcrash

Reputation: 51443

TL;DR: It runs twice because the body of the while loop is executed two times.

But process 0 ran twice, which is not what I expected.

Process 0 appears to run twice because the count variable has the value 3 right before entering the while loop: it is set by count=2*size-1, and size is 2 because you are running with 2 processes.

In your loop:

    while(count>0){
        if (rank==0) {
            ...
            count-=1;

            if (size>1) {
                ...
                count-=1;
            }
        }
        else {
            ...
            count-=1;
            if (rank<size-1) {
                ...
                count-=1;
            }
        }
        MPI_Barrier(MPI_COMM_WORLD);
    }

Process 0 decrements count twice in the first iteration, so count is 1 afterwards. Since the while loop condition is count>0, the body is executed once more before the loop exits. Thus, process 0 "runs again".

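Process 0 decrements count twice per iteration, whereas process 1 decrements it only once (rank<size-1 is false for the last rank), so this is most likely a bug. With 2 processes, process 0 leaves the loop after two iterations (count goes 3, 1, -1), while process 1 needs three (count goes 3, 2, 1, 0). Process 1 can therefore end up blocked in MPI_Recv, waiting for a message from process 0, while process 0 is already outside the loop.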

To test sending and receiving messages from process 0, try the following:

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int rank, value, size;
    MPI_Status status;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (rank==0)
    {
       // Process 0 reads the value once and sends it to every other process.
       printf("please input value= ");
       fflush(stdout);
       scanf("%d",&value);
       for(int i = 1; i < size; i++){
           MPI_Send(&value, 1, MPI_INT, i, 0, MPI_COMM_WORLD);
           printf("%d send (%d)->-> %d\n",rank, value, i);
       }
    }
    else
    {
       // Every other process receives directly from process 0.
       MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
       printf("%d receive(%d)<-<- %d\n",rank, value, 0);
    }
    MPI_Finalize();
    return 0;
}

Process 0 sends the value to all the remaining processes:

   for(int i = 1; i < size; i++){
       MPI_Send(&value, 1, MPI_INT, i, 0, MPI_COMM_WORLD);
       printf("%d send (%d)->-> %d\n",rank, value, i);
  }

and all the remaining processes receive the message from process 0:

MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);

Upvotes: 1
