Antoine Motte

Reputation: 23

What's the most efficient way to use MPI RDMA in read-only?

I'm currently dealing with a situation that should be really fast to handle with MPI one-sided communications, but I'm struggling to find similar examples and I'm not sure I have made the optimal choices.

Each MPI process owns a very large vector and needs to access data from other processes' vectors in an unpredictable pattern: I can't know in advance the frequency or the sizes of the messages, nor the exact moment the data will be needed. The big advantage I have is that these vectors stay constant during the entire function call in which the communication happens, so I don't have to worry about synchronization between reads and writes.

To be more precise, each process needs:

  1. An initial synchronization with all other processes to create a Window on the vector
  2. Alternating between remote reads on all other processes and data processing, as many times as needed
  3. A final synchronization with all other processes to close the window

At the moment I have this code structure:

//Initial sync
MPI_Win window;
MPI_Win_create(..., &window);
MPI_Win_fence(0, window);

//reading and processing data
MPI_Win_lock_all(MPI_MODE_NOCHECK, window);

while (keep_on)
{
    //loop over every process except world_rank
    for (int offset = 1; offset < world_size; ++offset)
    {
        int target = (world_rank + offset)%world_size;
        MPI_Get(..., window);
    }

    MPI_Win_flush_local_all(window);
    //process data and update keep_on
    [...]
}

MPI_Win_unlock_all(window);

//final sync
MPI_Win_fence(0, window);
MPI_Win_free(&window);
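
The target rotation used in the inner loop, (world_rank + offset) % world_size, can be checked in isolation. The sketch below is plain C with no MPI dependency; rotated_target and visits_all_others are hypothetical helper names introduced only for illustration, and the check assumes size >= 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: the rank targeted by `rank` at step `offset`
   (1 <= offset < size), using the same formula as the loop above. */
int rotated_target(int rank, int offset, int size)
{
    return (rank + offset) % size;
}

/* Hypothetical check: returns true if the rotation visits every rank
   other than `rank` exactly once and never visits `rank` itself. */
bool visits_all_others(int rank, int size)
{
    assert(size >= 1 && rank >= 0 && rank < size);

    bool *seen = calloc((size_t)size, sizeof *seen);
    if (seen == NULL)
        return false;

    bool ok = true;
    for (int offset = 1; offset < size; ++offset) {
        int target = rotated_target(rank, offset, size);
        if (target == rank || seen[target]) {
            ok = false;   /* self-access or duplicate target */
            break;
        }
        seen[target] = true;
    }

    /* Every rank other than `rank` must have been seen. */
    for (int r = 0; ok && r < size; ++r)
        if (r != rank && !seen[r])
            ok = false;

    free(seen);
    return ok;
}
```

For a fixed offset, the mapping from rank to target is one-to-one, so if all processes advance through the loop at a similar pace, no single target receives every request at the same step.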

And now I have some remarks/questions :

  1. I used the lock and unlock functions only because they are required in order to call flush, but I don't really see the point of these calls. Is there a better approach?
  2. I tried to use MPI_Rget/MPI_Wait instead of lock-get-flush-unlock but couldn't figure out how to make it work; maybe some additional function calls are needed. I got this error:
An error occurred in MPI_Rget
reported by process [4181227537,28]
on win ucx window 3
MPI_ERR_RMA_SYNC: error executing rma sync
  3. Is MPI_MODE_NOCHECK useful here? I'm also a bit confused that I never specify anything like MPI_MODE_NOPUT, which would let MPI know that I only want read-only access to the window.

Thank you for reading this post

Upvotes: 0

Views: 135

Answers (0)
