hcarver

Reputation: 7234

MPI send / receive programme never completes

I just spent a while writing a long answer to someone else's question, only for it to be deleted before I could post the answer. Didn't want the effort to be wasted, so I'm posting the question and answer here.

It's not simply the standard answer about sends and receives deadlocking, because I also found an interesting half-solution that only works on certain compilers.

In a parallel-programming course, we have to do an exercise based on the master/slave design pattern, in which master process 0 sends a message to all of its slaves, and each slave re-sends the message to its right and left neighbours (processor id +/- 1, except for processor 0, which has no left neighbour, and the last processor id, which has no right neighbour). After re-sending the message to its neighbour(s), the slave processor sends a confirmation back to the master that the job is done.

The exercise is simple, but there's a problem in my code: I receive the confirmation/end message at the very beginning of my program... I don't understand what the problem is here. I tried fflush, but the last line of the program should only be written to the console after the receives.

Does anyone have any idea? I'm new to MPI/C, so maybe there's something wrong in what I'm doing?

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <mpi.h>

int main(int argc, char *argv[]){
    int np, myId;
    char send[100], recv[100];

    MPI_Init(&argc, &argv);

    MPI_Comm_size(MPI_COMM_WORLD, &np);
    MPI_Comm_rank(MPI_COMM_WORLD, &myId);

    MPI_Status stat;
    if(myId == 0){
        int t = sprintf(send, "hey!"); //MPI_get_processor_name
        for(int i = 1; i < np; i++){
            printf("send %d => %d\n", myId, i);
            fflush(stdout);
            MPI_Send(send, 50, MPI_CHAR, i, 0, MPI_COMM_WORLD);
        }

        for(int i = 1; i < np; i++){
            MPI_Recv(recv, 50, MPI_CHAR, i, 0, MPI_COMM_WORLD, &stat);
            printf("%s\n", recv);
            fflush(stdout);
        }


    }else{
        if(myId < (np - 1)){
            printf("send %d => %d\n", myId, myId + 1);
            fflush(stdout);
            MPI_Send(send, 50, MPI_CHAR, myId + 1, 0, MPI_COMM_WORLD);
        }

        if(myId > 1){
            printf("Envoie %d => %d\n", myId, myId - 1);
            fflush(stdout);
            MPI_Send(send, 50, MPI_CHAR, myId - 1, 0, MPI_COMM_WORLD);
        }

        MPI_Recv(send, 50, MPI_CHAR, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &stat); 

        printf("Réception %d <= %d\n", myId, 0);
        fflush(stdout);

        if(myId != (np - 1)){
            MPI_Recv(send, 50, MPI_CHAR, myId + 1, 0, MPI_COMM_WORLD, &stat);
            printf("Receive %d <= %d\n", myId, myId + 1);
            fflush(stdout);
        }

        if(myId != 1){
            MPI_Recv(send, 50, MPI_CHAR, myId - 1, 0, MPI_COMM_WORLD, &stat);
            printf("Receive %d <= %d\n", myId, myId - 1);
            fflush(stdout);
        }

        int t = sprintf(recv, "End for %d.", myId);
        MPI_Send(recv, 50 , MPI_CHAR, 0, 0, MPI_COMM_WORLD); 
    }

    MPI_Finalize();
    return 0;
}

Upvotes: 3

Views: 4007

Answers (1)

hcarver

Reputation: 7234

Solution 1

Let's compare what all of the non-0, "slave" cores are actually doing with what you say they should do.

What you want them to do:

master process 0 sends a message to all of its slaves, and each slave re-sends the message to its right and left neighbours (processor id +/- 1, except for processor 0, which has no left neighbour, and the last processor id, which has no right neighbour). After re-sending the message to its neighbour(s), the slave processor sends a confirmation back to the master that the job is done.

Code outline:

Send_To_Right_Neighbour();
Send_To_Left_Neighbour();
Receive_From_Master();
Receive_From_Right_Neighbour();
Receive_From_Left_Neighbour();
Send_To_Master();

See the difference? The slaves aren't receiving the message from the master before re-sending it to their neighbours. Changing the code to:

Receive_From_Master();
Send_To_Right_Neighbour();
Send_To_Left_Neighbour();
Receive_From_Right_Neighbour();
Receive_From_Left_Neighbour();
Send_To_Master();

will fix that and then the code runs to completion for me.
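
Applied to the code in the question, the reordered slave branch looks roughly like this. It's only a sketch: I've renamed the buffers msg and confirm so their names match how they're used, and dropped the printf / fflush calls.

// Slave ranks: receive the message from the master FIRST...
char msg[100], confirm[100];
MPI_Recv(msg, 50, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &stat);

// ...then re-send it to the right and left neighbours, where they exist.
if(myId < np - 1)
    MPI_Send(msg, 50, MPI_CHAR, myId + 1, 0, MPI_COMM_WORLD);
if(myId > 1)
    MPI_Send(msg, 50, MPI_CHAR, myId - 1, 0, MPI_COMM_WORLD);

// Receive the copies the neighbours forwarded to this rank.
if(myId != np - 1)
    MPI_Recv(msg, 50, MPI_CHAR, myId + 1, 0, MPI_COMM_WORLD, &stat);
if(myId != 1)
    MPI_Recv(msg, 50, MPI_CHAR, myId - 1, 0, MPI_COMM_WORLD, &stat);

// Finally, confirm to the master that this rank has finished.
sprintf(confirm, "End for %d.", myId);
MPI_Send(confirm, 50, MPI_CHAR, 0, 0, MPI_COMM_WORLD);

As the next section explains, though, this ordering still isn't guaranteed to be safe: the sends to the neighbours can in principle still deadlock.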

What was going wrong

MPI_Send can be a blocking function -- i.e. the call to MPI_Send may not return until the receiving process has called a matching MPI_Recv. The standard allows, but doesn't require, this behaviour (small messages are often buffered and sent "eagerly"), so you should assume MPI_Send will always block when writing code.

Now let's imagine what non-0 processes do when you're running with >5 processes.

  • Process 1 sends to its right neighbour (process 2), and waits there until process 2 calls MPI_Recv.
  • Process 2 sends to its right neighbour (process 3), and waits there until process 3 calls MPI_Recv.
  • Process 3 sends to its right neighbour (process 4), and waits there until process 4 calls MPI_Recv.
  • ...
  • Process n-2 sends to its right neighbour (process n-1), and waits there until process n-1 calls MPI_Recv
  • Process n-1 doesn't have a right neighbour, so moves on to sending to its left neighbour, and waits there until process n-2 calls MPI_Recv.

This is never going to happen, because process n-2 is busy waiting for process n-1 to receive its data before it tries to receive from n-1. This is a deadlock and neither process is going to budge.
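
You can see the same hazard in isolation with just two ranks. The program below is my own illustration, not taken from the question: both ranks call MPI_Send before either posts a receive. Whether it actually hangs depends on the MPI implementation's eager threshold, so the message is made large enough that the send is unlikely to be buffered.

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

// Run with exactly 2 ranks. Both ranks send first, then receive:
// if MPI_Send blocks (large message, no internal buffering), neither
// send can complete -- the same deadlock pattern as the ring above.
int main(int argc, char *argv[]){
    int rank, other;
    const int n = 1 << 20; // big enough to avoid an eager (buffered) send
    MPI_Status stat;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    other = 1 - rank;

    double *out = calloc(n, sizeof(double));
    double *in = calloc(n, sizeof(double));

    MPI_Send(out, n, MPI_DOUBLE, other, 0, MPI_COMM_WORLD); // both ranks block here
    MPI_Recv(in, n, MPI_DOUBLE, other, 0, MPI_COMM_WORLD, &stat);

    printf("rank %d done\n", rank); // typically never reached

    free(out);
    free(in);
    MPI_Finalize();
    return 0;
}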

Why does the solution work

I've said the above solution works for me -- but it's not perfect. The only change I made was to move the receiving from process 0 to be the first step -- why should that have affected the deadlocking?

The answer is that it shouldn't have affected the deadlocking at all. My guess is that the compiler has been clever enough to realise that each core is sending to and receiving from the same neighbours, and has combined the separate MPI_Send and MPI_Recv calls to the left and right neighbours into MPI_Sendrecv calls. This sends and receives to a neighbour in the same step, getting rid of the deadlocking problem. Previously, the call to receive from 0 sat between the send and the receive to the same neighbour, so the compiler wasn't able to optimise them into a single operation.

But we don't want to rely on having a good compiler -- your code should work on any standard-compliant compiler -- so we should manually fix the deadlocking problem ourselves and not rely on the compiler being clever.
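
For reference, here's what that combined call looks like if you write it yourself -- a sketch of exchanging the 50-char message with the right neighbour, using the variables from the code above:

// Send to and receive from the right neighbour in a single call.
// MPI pairs the two internally, so neither rank can get stuck
// waiting for the other to post its receive.
MPI_Sendrecv(send, 50, MPI_CHAR, myId + 1, 0, // outgoing buffer, destination, tag
             recv, 50, MPI_CHAR, myId + 1, 0, // incoming buffer, source, tag
             MPI_COMM_WORLD, &stat);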

Solution 2

First of all, some comments on things that you may or may not have covered so far in your course:

  • Process 0 is sending the same data to all the other cores. If you've met MPI_Bcast, you should use that instead of all these sends and receives.
  • Process 0 receives from all the other cores at the end. If you're willing to have more than one char array to receive into, you can do this very simply with MPI_Gather (there's a sketch of both collectives just after this list).
  • I don't really understand the logic of the master process sending some data to every other process, and then each process sharing that same data with each of its neighbours (which have already been given it by the master). It'd make more sense to me if the data being shared were somehow different, or if the master only sent data to some of the slaves and they had to share it among themselves.
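
Here's a rough sketch of those two collectives, reusing myId and np from the question's code and the same 50-char messages. all_confirms is a buffer I've invented to hold one confirmation per rank on the master:

char msg[50], confirm[50];
char *all_confirms = NULL;

if(myId == 0){
    sprintf(msg, "hey!");
    all_confirms = malloc(np * 50 * sizeof(char));
}

// Every rank calls MPI_Bcast; rank 0's buffer is copied to all the others.
MPI_Bcast(msg, 50, MPI_CHAR, 0, MPI_COMM_WORLD);

// Every rank (including 0) contributes one confirmation string;
// rank 0 receives them all, packed one after another in all_confirms.
sprintf(confirm, "End for %d.", myId);
MPI_Gather(confirm, 50, MPI_CHAR, all_confirms, 50, MPI_CHAR, 0, MPI_COMM_WORLD);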

That said, let's talk about avoiding the deadlock. The root problem is that whenever one process calls MPI_Send, some other process must be able to call a matching MPI_Recv at the same time, without having to wait for the sending process to do anything else first. The problem arises because every core tries to send at the same time.

So, one way we can fix it is to decide that the info is going to move completely in one direction first. I've chosen left-to-right. In that case, each slave core has to do:

Receive_From_Master();

// Make sure all info is sent from left to right
Send_To_Right_Neighbour();
// Make sure any info is received from left to right
Receive_From_Left_Neighbour();

// Now send all info from right to left
Send_To_Left_Neighbour();
// Make sure any info is received from right to left
Receive_From_Right_Neighbour();

Send_To_Master();

What happens now is this:

  • process 1 starts sending to process 2
  • process 2 starts sending to process 3
  • ...
  • process n-2 starts sending to process n-1
  • process n-1 has no right neighbour, so moves on to receiving from process n-2
  • process n-2 finishes sending to process n-1, so moves on to receiving from process n-3
  • ...
  • process 2 finishes sending to process 3 and moves on to receiving from process 1.

The same thing happens on the right-to-left pass, except that now process 1 has no left neighbour to send to, so it can move straight on to receiving from process 2. There'll be no deadlock in either case.
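
Putting that together, the slave side of Solution 2 looks something like this -- again a sketch rather than a complete program, reusing myId, np and stat from the question's code, with the buffers renamed for clarity:

// Slave ranks (myId != 0): two one-directional passes, then confirm.
char msg[100], confirm[100];

// Get the message from the master first.
MPI_Recv(msg, 50, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &stat);

// Pass 1: information moves left-to-right.
if(myId < np - 1)
    MPI_Send(msg, 50, MPI_CHAR, myId + 1, 0, MPI_COMM_WORLD);
if(myId > 1)
    MPI_Recv(msg, 50, MPI_CHAR, myId - 1, 0, MPI_COMM_WORLD, &stat);

// Pass 2: information moves right-to-left.
if(myId > 1)
    MPI_Send(msg, 50, MPI_CHAR, myId - 1, 0, MPI_COMM_WORLD);
if(myId < np - 1)
    MPI_Recv(msg, 50, MPI_CHAR, myId + 1, 0, MPI_COMM_WORLD, &stat);

// Tell the master this rank is done.
sprintf(confirm, "End for %d.", myId);
MPI_Send(confirm, 50, MPI_CHAR, 0, 0, MPI_COMM_WORLD);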

Upvotes: 6
