Valentin Grigorev

Reputation: 145

MPI Send Recv deadlock

I write programs using MPI and I have access to two different clusters. I am not a system administrator, so I cannot say much about the software, OS, or compilers used there. However, on one of the machines I get a deadlock with the following code:

#include "mpi.h"
#include <iostream>

int main(int argc, char **argv) {

  int rank, numprocs;
  MPI_Status status;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &numprocs);

  int x = rank;

  if (rank == 0) {
      for (int i=0; i<numprocs; ++i)
          MPI_Send(&x, 1, MPI_INT, i, 100500, MPI_COMM_WORLD);
  }
  MPI_Recv(&x, 1, MPI_INT, 0, 100500, MPI_COMM_WORLD, &status);

  MPI_Finalize();
  return 0;
}

The related error message is:

Fatal error in MPI_Send: Other MPI error, error stack:
MPI_Send(184): MPI_Send(buf=0x7fffffffceb0, count=1, MPI_INT, dest=0, tag=100500, MPI_COMM_WORLD) failed
MPID_Send(54): DEADLOCK: attempting to send a message to the local process without a prior matching receive

Why is that? I can't understand why it happens on one machine but not on the other.

Upvotes: 0

Views: 3790

Answers (2)

Hristo Iliev

Reputation: 74405

Since rank 0 already has the correct value of x, you do not need to send it in a message. This means that in the loop you should skip sending to rank 0 and instead start from rank 1:

if (rank == 0) {
    for (int i=1; i<numprocs; ++i)
        MPI_Send(&x, 1, MPI_INT, i, 100500, MPI_COMM_WORLD);
}
MPI_Recv(&x, 1, MPI_INT, 0, 100500, MPI_COMM_WORLD, &status);

Now rank 0 won't try to talk to itself, but since the receive is outside the conditional, it will still try to receive a message from itself. The solution is to simply make the receive the alternative branch:

if (rank == 0) {
    for (int i=1; i<numprocs; ++i)
        MPI_Send(&x, 1, MPI_INT, i, 100500, MPI_COMM_WORLD);
}
else
    MPI_Recv(&x, 1, MPI_INT, 0, 100500, MPI_COMM_WORLD, &status);

Another more involved solution is to use non-blocking operations to post the receive before the send operation:

MPI_Request req;

MPI_Irecv(&x, 1, MPI_INT, 0, 100500, MPI_COMM_WORLD, &req);
if (rank == 0) {
    int xx = x;
    for (int i=0; i<numprocs; ++i)
        MPI_Send(&xx, 1, MPI_INT, i, 100500, MPI_COMM_WORLD);
}
MPI_Wait(&req, &status);

Now rank 0 will not block in MPI_Send as there is already a matching receive posted earlier. In all other ranks MPI_Irecv will be immediately followed by MPI_Wait, which is equivalent to a blocking receive (MPI_Recv). Note that the value of x is copied to a different variable inside the conditional as simultaneously sending from and receiving into the same memory location is forbidden by the MPI standard for obvious correctness reasons.

Upvotes: 1

Zulan

Reputation: 22670

MPI_Send is a blocking operation. It may not complete until a matching receive is posted. In your case rank 0 is trying to send a message to itself before having posted a matching receive. If you must do something like this, you could replace MPI_Send with MPI_Isend (plus an MPI_Wait after the receive); a sketch follows below. But you might as well just not make rank 0 send a message to itself.
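A minimal sketch of that MPI_Isend variant, assuming the same x, rank, numprocs, status and tag as in the question's code (the separate send buffer xx is added here so rank 0 does not send from and receive into the same memory location):

MPI_Request *reqs = new MPI_Request[numprocs];
int xx = x;  // separate send buffer for rank 0

if (rank == 0) {
    // Non-blocking sends return immediately, so rank 0 does not
    // deadlock waiting for its own receive to be posted first.
    for (int i = 0; i < numprocs; ++i)
        MPI_Isend(&xx, 1, MPI_INT, i, 100500, MPI_COMM_WORLD, &reqs[i]);
}
MPI_Recv(&x, 1, MPI_INT, 0, 100500, MPI_COMM_WORLD, &status);
if (rank == 0)
    MPI_Waitall(numprocs, reqs, MPI_STATUSES_IGNORE);
delete[] reqs;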

The proper thing to use in your case is MPI_Bcast.
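A minimal sketch of the broadcast version, using the same variables as in the question; MPI_Bcast acts as the send on the root and as the receive on every other rank, so no explicit MPI_Send/MPI_Recv pairing is needed:

int x = rank;
// Root (rank 0) broadcasts its value of x; all other ranks receive it.
MPI_Bcast(&x, 1, MPI_INT, 0 /* root */, MPI_COMM_WORLD);
// After the call, x holds rank 0's value on every process.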

Upvotes: 1
