Reputation: 2485
Documentation and examples concerning MPI-3 shared memory operations often include statements like the following, copied from a recent presentation from Intel (my emphasis):
// Start passive RMA epoch
MPI_Win_lock_all (MPI_MODE_NOCHECK, win);
// write into mem array hello_world info
mem[0] = rank;
mem[1] = numtasks;
memcpy(mem+2, name, namelen);
MPI_Win_sync (win); // memory fence - sync node exchanges
MPI_Barrier (shmcomm); //time barrier
Passive RMA synchronizations are needed for MPI SHM updates. The performance assertion MPI_MODE_NOCHECK hints that the epoch can begin immediately at the target. Note that on some platforms one more MPI_Win_sync would be needed after the MPI_Barrier to ensure memory consistency at the reader side.
Which platforms require this additional call to MPI_Win_sync()? What characteristics of these platforms lead to this requirement? Do I need this additional call on "standard" systems?
Upvotes: 3
Views: 364
Reputation: 5652
I reviewed this presentation before it was published, along with another person who helped write the MPI-3 RMA chapter, so I know the context for that comment.
The issue here is the memory model of the platform, which usually boils down to the CPU architecture. The x86 memory model is very strong: total store ordering (TSO) means that memory fences are unnecessary in some scenarios that would require them on, e.g., DEC Alpha. I use Alpha as the example because it had the weakest memory model of any widely used CPU.
Anyway: in practice, though not guaranteed by the MPI standard, a call to MPI_Barrier will synchronize memory, and the explicit MPI_Win_sync makes this guaranteed, on x86 at least. The second MPI_Win_sync, after the MPI_Barrier, would be necessary on a platform where a load fence is required before reading data that was written and store-fenced by another process.
I can draw a picture if it helps.
Upvotes: 0