Reputation: 41
I have an MPI_Isend/MPI_Recv problem in a multithreaded program.
In the program:
On the first machine, one thread does some computation and calls MPI_Isend to send a buffer to the second machine, while another thread is always trying to MPI_Recv data from the second machine. The first thread calls MPI_Wait so that its last MPI_Isend completes before it calls MPI_Isend again.
The second machine does the exact same thing.
Then I got the following result:
The first machine:
Thread 0: called MPI_Isend to send data to the second machine successfully, but blocked in MPI_Wait because the last MPI_Isend did not complete.
Thread 1: tried to MPI_Recv data from the second machine, but there was no data and it blocked.
The second machine:
Thread 0: called MPI_Isend to send data to the first machine successfully, but blocked in MPI_Wait because the last MPI_Isend did not complete.
Thread 1: tried to MPI_Recv data from the first machine, but there was no data and it blocked.
Does anyone have any ideas? I would appreciate it very much, because I have been tracking this problem for two days without any progress.
Upvotes: 2
Views: 1661
Reputation: 74495
In order to be able to execute MPI calls concurrently, you have to initialise the MPI library with support for threads at level MPI_THREAD_MULTIPLE. To do so, you have to replace the call to MPI_Init with:
int provided;
MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
if (provided != MPI_THREAD_MULTIPLE)
{
    printf("Sorry, this MPI implementation does not support multiple threads\n");
    MPI_Abort(MPI_COMM_WORLD, 1);
}
Some MPI libraries have to be compiled in a certain (non-default) way in order to support calls from multiple threads. For example, Open MPI has to be configured for thread support at library build time. Other vendors provide two versions of their libraries, one with support for threads and one without, and you have to choose the correct one when you link your code. This is because adding support for threads increases the latency of many MPI calls, and no one wants that if their program does not use threads.
Upvotes: 2