Reputation: 661
For the problem I'd like to discuss, let's take MPI_Barrier as an example. The MPI-3 standard states:
If comm is an intracommunicator, MPI_BARRIER blocks the caller until all group members have called it. The call returns at any process only after all group members have entered the call.
So I was wondering (and the same essentially applies to all collective operations) how this statement should be interpreted when some processes of the communication context have already exited successfully before MPI_Barrier is executed. For example, assume we have two processes A and B and pass MPI_COMM_WORLD as the comm argument to MPI_Barrier. After A and B call MPI_Init, if B immediately calls MPI_Finalize and exits, and only A calls MPI_Barrier before calling MPI_Finalize, is A blocked for eternity? Or is the set of "all group members" defined as the set of all original group members that have not yet exited? I'm pretty sure A is blocked forever, but maybe the MPI standard has more to say about this?
REMARK: This is not a question about the synchronizing properties of MPI_Barrier; the reference to MPI_Barrier is merely meant to be a concrete example. It is a question about MPI program correctness when collective operations are performed. See the comments.
Upvotes: 2
Views: 130
Reputation: 22670
If B exits right at program start and only A calls MPI_Barrier, is A blocked for eternity?
Basically yes. But actually, you are not allowed to do that.
Simply speaking, you must call MPI_Finalize on all processes before exiting. And MPI_Finalize acts like a collective (on MPI_COMM_WORLD), so it usually does not complete before every process calls MPI_Finalize. So in your example, process B didn't exit (at least not correctly).
Section 8.7 of the MPI 3.1 standard explains it more clearly:
MPI_Finalize
[...] This routine cleans up all MPI state. If an MPI program terminates normally (i.e., not due to a call to MPI_ABORT or an unrecoverable error) then each process must call MPI_FINALIZE before it exits. Before an MPI process invokes MPI_FINALIZE, the process must perform all MPI calls needed to complete its involvement in MPI communications: It must locally complete all MPI operations that it initiated and must execute matching calls needed to complete MPI communications initiated by other processes.
Note how the last sentence also requires you to complete the barrier in your question.
According to the standard, your program is not correct. In practice it will most likely deadlock or hang.
Upvotes: 4