Reputation: 7234
I just spent a while writing a long answer to someone else's question, only for it to be deleted before I could post the answer. Didn't want the effort to be wasted, so I'm posting the question and answer here.
It's not simply the standard answer about sends/receives deadlocking, because I also found an interesting half-solution that works only with certain compilers.
In a parallel programming course, we have an exercise based on the master/slave design pattern, where the master process 0 sends a message to all of its slaves, and each slave re-sends the message to its right and left neighbours (processor id +/- 1, except for processor 0, which has no left neighbour, and the last processor id, which has no right neighbour). After re-passing the message to its neighbour(s), the slave processor sends a confirmation to the master that the job is done.
The exercise is simple, but there's a problem in my code: I receive the end-confirmation message at the beginning of my program... I don't understand what the problem is. I tried fflush, but the last line of the program should only be written to the console after the receptions.
Does anyone have any idea? I'm new to MPI/C, so maybe there's something wrong in what I'm doing?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <mpi.h>

int main(int argc, char *argv[]){
    int np, myId;
    char send[100], recv[100];
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &np);
    MPI_Comm_rank(MPI_COMM_WORLD, &myId);
    MPI_Status stat;
    if(myId == 0){
        int t = sprintf(send, "hey!"); //MPI_get_processor_name
        for(int i = 1; i < np; i++){
            printf("send %d => %d\n", myId, i);
            fflush(stdout);
            MPI_Send(send, 50, MPI_CHAR, i, 0, MPI_COMM_WORLD);
        }
        for(int i = 1; i < np; i++){
            MPI_Recv(recv, 50, MPI_CHAR, i, 0, MPI_COMM_WORLD, &stat);
            printf("%s\n", recv);
            fflush(stdout);
        }
    }else{
        if(myId < (np - 1)){
            printf("send %d => %d\n", myId, myId + 1);
            fflush(stdout);
            MPI_Send(send, 50, MPI_CHAR, myId + 1, 0, MPI_COMM_WORLD);
        }
        if(myId > 1){
            printf("send %d => %d\n", myId, myId - 1);
            fflush(stdout);
            MPI_Send(send, 50, MPI_CHAR, myId - 1, 0, MPI_COMM_WORLD);
        }
        MPI_Recv(send, 50, MPI_CHAR, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &stat);
        printf("Receive %d <= %d\n", myId, 0);
        fflush(stdout);
        if(myId != (np - 1)){
            MPI_Recv(send, 50, MPI_CHAR, myId + 1, 0, MPI_COMM_WORLD, &stat);
            printf("Receive %d <= %d\n", myId, myId + 1);
            fflush(stdout);
        }
        if(myId != 1){
            MPI_Recv(send, 50, MPI_CHAR, myId - 1, 0, MPI_COMM_WORLD, &stat);
            printf("Receive %d <= %d\n", myId, myId - 1);
            fflush(stdout);
        }
        int t = sprintf(recv, "End for %d.", myId);
        MPI_Send(recv, 50, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
    }
    MPI_Finalize();
    return 0;
}
Upvotes: 3
Views: 4007
Reputation: 7234
Let's compare what all of the non-0, "slave" cores are actually doing with what you say they should do.
What you want them to do:
the master process 0 sends a message to all of its slaves, and each slave re-sends the message to its right and left neighbours (processor id +/- 1, except for processor 0, which has no left neighbour, and the last processor id, which has no right neighbour). After re-passing the message to its neighbour(s), the slave processor sends a confirmation to the master that the job is done.
Code outline:
Send_To_Right_Neighbour();
Send_To_Left_Neighbour();
Receive_From_Master();
Receive_From_Right_Neighbour();
Receive_From_Left_Neighbour();
Send_To_Master();
See the difference? The slaves aren't receiving the message from the master before re-sending it to their neighbours. Changing the code to:
Receive_From_Master();
Send_To_Right_Neighbour();
Send_To_Left_Neighbour();
Receive_From_Right_Neighbour();
Receive_From_Left_Neighbour();
Send_To_Master();
will fix that and then the code runs to completion for me.
MPI_Send can be a blocking call -- i.e. the call to MPI_Send may not return until the other process has called a matching MPI_Recv (though, depending on the message size and the implementation's internal buffering, it doesn't have to block). You should assume it will always block when writing code.
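To see this concretely, here's a minimal sketch of my own (not from the question) where two ranks both call MPI_Send before either posts a receive. For small messages, most implementations buffer the send internally and it appears to work; once the message exceeds the implementation's eager-send limit, both sends block waiting for a matching receive, and the program hangs:

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define N (1 << 20)  /* big enough to exceed typical eager-send limits */

int main(int argc, char *argv[]) {
    int rank;
    double *buf = calloc(N, sizeof(double));
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    /* run with exactly 2 processes: each rank sends to the other first */
    int other = 1 - rank;
    MPI_Send(buf, N, MPI_DOUBLE, other, 0, MPI_COMM_WORLD);  /* may never return */
    MPI_Recv(buf, N, MPI_DOUBLE, other, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    printf("rank %d done\n", rank);  /* not reached once MPI_Send really blocks */
    free(buf);
    MPI_Finalize();
    return 0;
}
```

Shrink N to something tiny and the same program will usually "work" -- which is exactly why you shouldn't rely on that behaviour.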
Now let's imagine what the non-0 processes do when you're running with >5 processes:
- process 1 sends to process 2, and sits in MPI_Send until process 2 calls MPI_Recv
- process 2 sends to process 3, and sits in MPI_Send until process 3 calls MPI_Recv
- ...
- process n-2 sends to process n-1, and sits in MPI_Send until process n-1 calls MPI_Recv
- process n-1 has no right neighbour, so sends to process n-2, and sits in MPI_Send until process n-2 calls MPI_Recv

This is never going to happen, because process n-2 is busy waiting for process n-1 to receive its data before it tries to receive from n-1. This is a deadlock and neither process is going to budge.
I've said the above solution works for me -- but it's not perfect. The only change I made was to move the receiving from process 0 to be the first step -- why should that have affected the deadlocking?
The answer is that it shouldn't have affected the deadlocking at all. My guess is that the compiler has been clever enough to realise that each core is sending to and receiving from the same neighbours, and has combined the separate MPI_Send and MPI_Recv calls to the left and right neighbours into MPI_Sendrecv calls. This sends to and receives from a neighbour in the same step, getting rid of the deadlocking problem. Previously, the call to receive from 0 sat between the sends and receives to the same neighbour, so the compiler wasn't able to optimise them into a single operation.
But we don't want to rely on having a good compiler -- your code should work on any standard-compliant compiler -- so we should manually fix the deadlocking problem ourselves and not rely on the compiler being clever.
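If you do want the send-and-receive-with-the-same-neighbour behaviour, you can ask for it explicitly rather than hope for it. A hedged sketch, reusing myId and np from the question's code and meant to slot into the slave branch (the right-neighbour exchange is shown; the left one is symmetric):

```c
/* MPI_PROC_NULL turns the transfer into a no-op, so the last
   rank needs no special-casing for its missing right neighbour */
int right = (myId < np - 1) ? myId + 1 : MPI_PROC_NULL;
char out[100] = "hey!", in[100];
MPI_Sendrecv(out, 50, MPI_CHAR, right, 0,   /* what we send, and to whom   */
             in,  50, MPI_CHAR, right, 0,   /* where we receive, from whom */
             MPI_COMM_WORLD, MPI_STATUS_IGNORE);
```

MPI_Sendrecv is guaranteed not to deadlock against a matching MPI_Sendrecv (or a plain send/receive pair) on the neighbour, which is the whole point.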
First of all, some comments on things that you may or may not have covered so far in your course:
- sending the same message from process 0 to every other process is exactly what MPI_Bcast is for; you should use that instead of all these sends and receives.
- collecting the "job done" confirmations back on process 0 can likewise be done with MPI_Gather.

That said, let's talk about avoiding the deadlock. The root problem is that we have to make sure that whenever one process calls MPI_Send, another process can call a matching MPI_Recv at the same time, without the sender having to wait for anything else first. The problem arises from each core trying to send at the same time.
So, one way we can fix it is to decide that the info is going to move completely in one direction first. I've chosen left-to-right. In that case, each slave core has to do:
Receive_From_Master();
// Make sure all info is sent from left to right
Send_To_Right_Neighbour();
// Make sure any info is received from left to right
Receive_From_Left_Neighbour();
// Now send all info from right to left
Send_To_Left_Neighbour();
// Make sure any info is received
Receive_From_Right_Neighbour();
Send_To_Master();
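In terms of the original code, the slave branch would become something like this -- a sketch based on the outline above, keeping the buffer names and conditions from the question:

```c
} else {
    /* 1. receive the message from the master first */
    MPI_Recv(send, 50, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &stat);

    /* 2. everything moves left-to-right: send right, then receive from the left */
    if (myId < np - 1)
        MPI_Send(send, 50, MPI_CHAR, myId + 1, 0, MPI_COMM_WORLD);
    if (myId > 1)
        MPI_Recv(recv, 50, MPI_CHAR, myId - 1, 0, MPI_COMM_WORLD, &stat);

    /* 3. everything moves right-to-left: send left, then receive from the right */
    if (myId > 1)
        MPI_Send(send, 50, MPI_CHAR, myId - 1, 0, MPI_COMM_WORLD);
    if (myId < np - 1)
        MPI_Recv(recv, 50, MPI_CHAR, myId + 1, 0, MPI_COMM_WORLD, &stat);

    /* 4. confirm to the master that the job is done */
    sprintf(recv, "End for %d.", myId);
    MPI_Send(recv, 50, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
}
```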
What happens now is this:
- process n-1 has no right neighbour to send to, so it moves straight on to receiving from process n-2
- that lets process n-2's send complete, so it moves on to receiving from process n-3
- and so on down the line, until process 1's send to process 2 completes.

The same will happen when sending right-to-left, except that now process 1 has no left neighbour to send to, so it can move straight on to receiving from process 2. There'll be no deadlock in either case.
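For completeness, here's roughly what the master/slave traffic looks like with the collectives MPI_Bcast and MPI_Gather -- a sketch of my own covering only the broadcast and the confirmation gathering, not the neighbour exchange, with an assumed cap of 128 processes for the receive buffer:

```c
#include <mpi.h>
#include <stdio.h>

#define LEN 50

int main(int argc, char *argv[]) {
    int np, myId;
    char msg[LEN], done[LEN];
    char all[LEN * 128];            /* assumes np <= 128 for this sketch */
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &np);
    MPI_Comm_rank(MPI_COMM_WORLD, &myId);

    if (myId == 0) sprintf(msg, "hey!");
    /* one call replaces the loop of sends: every rank ends up with msg */
    MPI_Bcast(msg, LEN, MPI_CHAR, 0, MPI_COMM_WORLD);

    /* ... the neighbour exchange would go here ... */

    sprintf(done, "End for %d.", myId);
    /* one call replaces the loop of receives: rank 0 collects every string */
    MPI_Gather(done, LEN, MPI_CHAR, all, LEN, MPI_CHAR, 0, MPI_COMM_WORLD);

    if (myId == 0)
        for (int i = 1; i < np; i++)
            printf("%s\n", all + i * LEN);

    MPI_Finalize();
    return 0;
}
```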
Upvotes: 6