Reputation: 722
This is an example from the MPI standard (version 4.1).
We have 3 processes and 3 communicators. The standard states that
It is not possible to perform a blocking collective operation on all communicators because there exists no deadlock-free order to invoke them. However, nonblocking collective operations can easily be used to achieve this task.
It then provides this code snippet:
MPI_Request reqs[2];
switch(rank) {
case 0:
MPI_Iallreduce(sbuf1, rbuf1, count, dtype, MPI_SUM, comm1, &reqs[0]);
MPI_Iallreduce(sbuf3, rbuf3, count, dtype, MPI_SUM, comm3, &reqs[1]);
break;
case 1:
MPI_Iallreduce(sbuf1, rbuf1, count, dtype, MPI_SUM, comm1, &reqs[0]);
MPI_Iallreduce(sbuf2, rbuf2, count, dtype, MPI_SUM, comm2, &reqs[1]);
break;
case 2:
MPI_Iallreduce(sbuf2, rbuf2, count, dtype, MPI_SUM, comm2, &reqs[0]);
MPI_Iallreduce(sbuf3, rbuf3, count, dtype, MPI_SUM, comm3, &reqs[1]);
break;
}
MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
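For context, a self-contained version could look roughly like the sketch below. How comm1, comm2 and comm3 are built (each containing two of the three ranks, via MPI_Comm_split) is my assumption, since the standard's example only shows the calls themselves; it is meant to be launched with exactly 3 processes.
/* Sketch of a complete program around the snippet above; assumes
   comm1 = {ranks 0,1}, comm2 = {ranks 1,2}, comm3 = {ranks 0,2}. */
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Each rank joins two of the three communicators; the color
       MPI_UNDEFINED yields MPI_COMM_NULL for the one it does not join. */
    MPI_Comm comm1, comm2, comm3;
    MPI_Comm_split(MPI_COMM_WORLD, (rank == 0 || rank == 1) ? 0 : MPI_UNDEFINED, rank, &comm1);
    MPI_Comm_split(MPI_COMM_WORLD, (rank == 1 || rank == 2) ? 0 : MPI_UNDEFINED, rank, &comm2);
    MPI_Comm_split(MPI_COMM_WORLD, (rank == 0 || rank == 2) ? 0 : MPI_UNDEFINED, rank, &comm3);

    int count = 1, sbuf1 = rank, sbuf2 = rank, sbuf3 = rank, rbuf1, rbuf2, rbuf3;
    MPI_Request reqs[2] = {MPI_REQUEST_NULL, MPI_REQUEST_NULL};

    switch (rank) {
    case 0:
        MPI_Iallreduce(&sbuf1, &rbuf1, count, MPI_INT, MPI_SUM, comm1, &reqs[0]);
        MPI_Iallreduce(&sbuf3, &rbuf3, count, MPI_INT, MPI_SUM, comm3, &reqs[1]);
        break;
    case 1:
        MPI_Iallreduce(&sbuf1, &rbuf1, count, MPI_INT, MPI_SUM, comm1, &reqs[0]);
        MPI_Iallreduce(&sbuf2, &rbuf2, count, MPI_INT, MPI_SUM, comm2, &reqs[1]);
        break;
    case 2:
        MPI_Iallreduce(&sbuf2, &rbuf2, count, MPI_INT, MPI_SUM, comm2, &reqs[0]);
        MPI_Iallreduce(&sbuf3, &rbuf3, count, MPI_INT, MPI_SUM, comm3, &reqs[1]);
        break;
    }
    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);

    if (comm1 != MPI_COMM_NULL) MPI_Comm_free(&comm1);
    if (comm2 != MPI_COMM_NULL) MPI_Comm_free(&comm2);
    if (comm3 != MPI_COMM_NULL) MPI_Comm_free(&comm3);

    MPI_Finalize();
    return 0;
}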
I see no deadlock in the blocking version of the same snippet, i.e. in
switch(rank) {
case 0:
MPI_Allreduce(sbuf1, rbuf1, count, dtype, MPI_SUM, comm1);
MPI_Allreduce(sbuf3, rbuf3, count, dtype, MPI_SUM, comm3);
break;
case 1:
MPI_Allreduce(sbuf1, rbuf1, count, dtype, MPI_SUM, comm1);
MPI_Allreduce(sbuf2, rbuf2, count, dtype, MPI_SUM, comm2);
break;
case 2:
MPI_Allreduce(sbuf2, rbuf2, count, dtype, MPI_SUM, comm2);
MPI_Allreduce(sbuf3, rbuf3, count, dtype, MPI_SUM, comm3);
break;
}
So the timing diagram would approximately be: ranks 0 and 1 complete the allreduce on comm1 first, then ranks 1 and 2 complete the one on comm2, and finally ranks 0 and 2 complete the one on comm3, so every call eventually returns.
Am I missing something? Maybe by "deadlock" the standard doesn't mean a complete standstill.
According to Wikipedia,
If a process remains indefinitely unable to change its state because resources requested by it are being used by another process that itself is waiting, then the system is said to be in a deadlock.
(emphasis mine)
Upvotes: 0
Views: 124
Reputation: 1
The standard is implying that there is no guarantee this is deadlock-free, and a program that assumes it is would be an incorrect program.
However, the MPI implementation is probably doing nonblocking operations under the covers, so it happens to work. But it is not standard-compliant to assume that behavior, and there is no guarantee that some other MPI implementation, or a more complicated example, will work.
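For example (this variant is mine, not from the standard): if ranks 1 and 2 each issue their two blocking calls in the opposite order, every rank blocks in its first collective waiting on the next rank, and the program deadlocks, while the nonblocking version with MPI_Waitall still completes because the implementation can progress all of the operations together.
switch (rank) {
case 0:
    MPI_Allreduce(sbuf1, rbuf1, count, dtype, MPI_SUM, comm1); /* waits for rank 1 */
    MPI_Allreduce(sbuf3, rbuf3, count, dtype, MPI_SUM, comm3);
    break;
case 1:
    MPI_Allreduce(sbuf2, rbuf2, count, dtype, MPI_SUM, comm2); /* waits for rank 2 */
    MPI_Allreduce(sbuf1, rbuf1, count, dtype, MPI_SUM, comm1);
    break;
case 2:
    MPI_Allreduce(sbuf3, rbuf3, count, dtype, MPI_SUM, comm3); /* waits for rank 0 */
    MPI_Allreduce(sbuf2, rbuf2, count, dtype, MPI_SUM, comm2);
    break;
}
/* Ranks 0 -> 1 -> 2 -> 0 each block in their first collective waiting for the
   next rank: a circular wait, so none of the blocking calls can ever return. */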
Upvotes: 0