Zlatan Sičanica

Reputation: 355

Mpi4py mpi_test always returns false

I couldn't find a similar question here, so here goes: why does the following code always output (False, None)? Shouldn't it be (True, None), if test() is called 3 seconds after process 0 sent the message? Also, if I call req.wait() before test() I get the output I need, but then the call blocks, so test() loses its purpose (I want to be able to tell whether process 1 got a message from any source during the 3 seconds it slept).

Code:

import time
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    req = comm.isend(0, 1, tag=0)   # non-blocking send of the object 0 to rank 1
    req.wait()
elif rank == 1:
    req = comm.irecv()              # non-blocking receive from any source, any tag
    time.sleep(3)
    print(req.test())               # always prints (False, None)

Upvotes: 2

Views: 1132

Answers (2)

David Henty

Reputation: 1764

Maybe it's just because you've created this example by editing down a larger program, but I just wanted to check there wasn't some underlying misunderstanding of non-blocking MPI comms ...

I don't understand why you have:

req = comm.isend(0, 1, tag=0)
req.wait()

since this is functionally identical to the blocking call

comm.send(0, 1, tag=0)

Of course, the non-blocking form means you can later insert more code between isend and wait, which is perhaps what you meant to do.
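
For illustration, here is a minimal sketch of that idea (the overlapped work, local_sum, is just a placeholder invented for the example):

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    req = comm.isend(0, 1, tag=0)       # start the send
    local_sum = sum(range(1000000))     # placeholder work overlapped with the send
    req.wait()                          # complete the send only when it must be done
elif rank == 1:
    print(comm.recv(source=0, tag=0))   # matching blocking receive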

Upvotes: 0

Gilles

Reputation: 9489

I am no expert in mpi4py, but assuming it behaves like its MPI C counterpart (which seems like a fair assumption), then indeed there is little surprise here.

Well, in fairness, the output of your code is not specified by the MPI standard. What is guaranteed is that after some number of calls to the MPI_Test() function, it will return true. That number can be anything: it might return true on the first call, on the second, or only after one billion calls... Therefore, the usual/recommended way of using MPI_Test() is to call it here and there, and to finish either with a loop over it (with an exit condition based on its output) or with a call to MPI_Wait().
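
As a rough sketch of that recommended pattern in mpi4py (assuming test() behaves like MPI_Test(); the short sleep and the message value are placeholders):

import time
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    comm.send(0, 1, tag=0)        # a plain blocking send is enough on this side
elif rank == 1:
    req = comm.irecv()
    done, msg = req.test()        # poll "here and there" ...
    while not done:               # ... and finish with a loop with an exit condition
        time.sleep(0.01)          # or do other useful work instead of sleeping
        done, msg = req.test()    # each call gives MPI a chance to progress the receive
    print(done, msg)              # prints: True 0 once the message has arrived

Equivalently, the final loop can be replaced by a single req.wait() if blocking at that point is acceptable.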

Now, the reason for that is the following: the MPI library usually doesn't perform any action outside of explicit MPI calls. Therefore, in order for a non-blocking communication to progress, you have to perform some MPI calls. These calls do not need to be related to the pending communication (usually any MPI call will internally trigger some progress of the message queue), but you do need to hand control to the MPI library for that to happen. That is exactly what the calls to MPI_Test() do. It also explains why this isn't really time-related: your call to sleep() does give the communication time to happen, but since the MPI library never gets control in between, nothing actually happens.
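
Applied to the question's scenario, this suggests replacing the single sleep(3) with short sleep slices interleaved with test() calls, so that the MPI library gets control during the 3-second window. A rough sketch (the 0.1 s slice length is an arbitrary choice):

import time
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    comm.send(0, 1, tag=0)
elif rank == 1:
    req = comm.irecv()                 # receive from any source, any tag
    arrived, msg = False, None
    deadline = time.time() + 3.0
    while time.time() < deadline and not arrived:
        arrived, msg = req.test()      # lets the library progress the receive
        time.sleep(0.1)
    print(arrived, msg)                # True plus the message if it arrived in time

If nothing arrived within the window, the request is still pending, so you would either keep polling it later or eventually wait() on it.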

Finally, a few caveats to the explanations above:

  1. The above assumes there is no external mechanism progressing the in-flight messages. However, (Remote) Direct Memory Access engines, such as those available on InfiniBand cards for example, can indeed progress messages without any extra MPI calls. This usually only applies to inter-node communications, though, and is highly dependent on your hardware and software.
  2. Some MPI libraries offer, as an extension, the possibility of dedicating a CPU thread to progressing MPI communications outside of MPI calls. Some MPICH-based MPI libraries, such as Intel MPI, provide the MPICH_ASYNC_PROGRESS environment variable which, once set to 1, triggers the creation of such a communication thread to progress non-blocking communications behind the scenes (a hedged sketch of enabling it from Python follows after this list). Not sure if Open MPI offers this feature as well...
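
As a hedged sketch only (whether this works depends entirely on your MPI library, and exporting the variable in the shell or on the mpiexec/mpirun command line before launching is the more usual approach), the variable has to be in the environment before MPI initialises, which in mpi4py means before the MPI module is imported:

import os

# Assumption: an MPICH-based library (e.g. Intel MPI) that honours this variable.
# It must be set before MPI_Init, i.e. before importing mpi4py.MPI.
os.environ["MPICH_ASYNC_PROGRESS"] = "1"

from mpi4py import MPI      # MPI_Init runs here, with the variable already set

comm = MPI.COMM_WORLD
print(comm.Get_rank(), "running with asynchronous progress requested")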

Upvotes: 1
