Pri

Reputation: 25

MPI send and receive deadlock

I'm very new to MPI and am writing a basic send-and-receive program in which rank 0 sends the 12 months to n processes, and each process receives its months and prints them. The values are sent and received correctly, but the program then hangs, i.e. it never prints "After program is complete" at the end. Can you please help?

#include <stdio.h>
#include <string.h>
#include "mpi.h"
#include<math.h>

int main(int argc, char* argv[]){
int  my_rank; /* rank of process */
int  p;       /* number of processes */

int tag=0;    /* tag for messages */

MPI_Status status ;   /* return status for receive */
int i;
int pro;
/* start up MPI */

MPI_Init(&argc, &argv);

// find out process rank
MPI_Comm_rank(MPI_COMM_WORLD, &my_rank); 

//find out number of processes
MPI_Comm_size(MPI_COMM_WORLD, &p); 
if (my_rank==0)
{
    for(i=1;i<=12;i++)
    {
        pro = (i-1)%p;
        MPI_Send(&i, 1, MPI_INT,pro, tag, MPI_COMM_WORLD);
        printf("Value of Processor is %d Month %d\n",pro,i);
    }
}

//else{
for(int n=0;n<=p;n++)
{

    MPI_Recv(&i, 1, MPI_INT, 0, tag, MPI_COMM_WORLD, &status);
    printf("My Month is %d and rank is %d\n",i,my_rank);

}
//}
MPI_Barrier(MPI_COMM_WORLD);
if(my_rank==0)
{
    printf("After program is complete\n");
}
/* shut down MPI */

MPI_Finalize(); 
return 0;
}

Below is the output:
Value of Processor is 0 Month 1
Value of Processor is 1 Month 2
Value of Processor is 2 Month 3
Value of Processor is 3 Month 4
Value of Processor is 4 Month 5
Value of Processor is 0 Month 6
Value of Processor is 1 Month 7
Value of Processor is 2 Month 8
Value of Processor is 3 Month 9
Value of Processor is 4 Month 10
Value of Processor is 0 Month 11
My Month is 2 and rank is 1
My Month is 7 and rank is 1
My Month is 3 and rank is 2
My Month is 8 and rank is 2
Value of Processor is 1 Month 12
My Month is 1 and rank is 0
My Month is 6 and rank is 0
My Month is 11 and rank is 0
My Month is 12 and rank is 1
My Month is 4 and rank is 3
My Month is 9 and rank is 3
My Month is 5 and rank is 4
My Month is 10 and rank is 4

Upvotes: 1

Views: 1195

Answers (1)

Zulan

Reputation: 22660

First: You violate one of the basic rules of MPI: every send must be matched by exactly one receive.

In your example run, you use 5 processes (ranks), and as the output shows, rank 0 sends 3 messages each to ranks 0 and 1 and 2 messages each to the remaining ranks. However, with the loop bound n<=p, each rank posts p + 1 = 6 receives, so the ranks naturally get stuck waiting for messages that are never sent. Remember that the loop around MPI_Recv is executed by all ranks, so there are 5 * 6 = 30 receives in total for only 12 sends.
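To make the mismatch concrete, here is a small sketch in plain C (no MPI calls) that counts how many messages each rank is sent under the question's rule pro = (i-1)%p and compares that with the number of receives the question's loop posts. The helper names months_for_rank, receives_posted and print_mismatch are made up for this illustration.

```c
#include <stdio.h>

#define NUM_MONTHS 12

/* Number of months rank 0 sends to `rank` when month i (1..12)
   goes to rank (i - 1) % p, as in the question. */
int months_for_rank(int rank, int p)
{
    int count = 0;
    for (int i = 1; i <= NUM_MONTHS; i++) {
        if ((i - 1) % p == rank) {
            count++;
        }
    }
    return count;
}

/* Number of receives each rank posts in the question's loop
   for (int n = 0; n <= p; n++). */
int receives_posted(int p)
{
    return p + 1;
}

/* Print sends versus posted receives for every rank. */
void print_mismatch(int p)
{
    for (int rank = 0; rank < p; rank++) {
        printf("rank %d: %d sent, %d receives posted\n",
               rank, months_for_rank(rank, p), receives_posted(p));
    }
}
```

Calling print_mismatch(5) reports 3 messages for ranks 0 and 1 and 2 messages for ranks 2 to 4, which matches the output in the question, against 6 receives posted on every rank.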

You can fix that by checking inside the loop whether it is your rank's turn to receive. That only works if every rank knows beforehand how many messages rank 0 is going to send; otherwise you may need a more complicated mechanism.

Second: Your rank 0 sends a blocking message to itself without posting a non-blocking receive first. That alone can cause a deadlock. Remember that an MPI_Send is never guaranteed to return before the matching receive has been posted, even though it sometimes may in practice.

Third: The loop for(int n=0;n<=p;n++) runs p + 1 times (6 times in your run). You most certainly didn't want that, although running it p times would not be correct either.

Finally: For this specific example, the preferred solution would be to store the months in an array and distribute it across all processes using MPI_Scatterv.

Upvotes: 1
