ThatsRightJack

Reputation: 751

Unknown error with exchanging halo/ghost cells using MPI (C)

I'm new to MPI, so go easy on me... Anyway, I'm trying to use MPI_Isend and MPI_Irecv for non-blocking communication. I wrote a subroutine called "halo_exchange" which I'd like to call each time I need to exchange halo cells between neighboring sub-domains. I'm able to split the domain up properly, and I know each of my neighbor ranks. In the code below, the neighbors are oriented North/South (i.e. I use a 1D row decomposition). All processes are used in the computation; in other words, every process calls this subroutine and needs to exchange data.

Originally I was using a single set of MPI_Isend/MPI_Irecv calls for both the North and South boundaries, but then I split it up, thinking maybe there was something wrong with passing "MPI_PROC_NULL" to the functions (the boundaries are not periodic). That is the reason for the if statements. The code keeps getting hung up on the MPI_Waitall statements and I don't know why. It literally just waits, and I'm not sure what it's waiting for.

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

//---------------------------------------------------------------------------------------
// FUNCTION "halo_exchange"
//---------------------------------------------------------------------------------------
void halo_exchange(PREFIX **array, MPI_Comm topology_comm,              \
       int nn, int S_neighbor, int N_neighbor)
{
  int halo = 2;
  int M = 20;

  ...

  double *S_Recv,*N_Recv;
  double *S_Send,*N_Send;

  // Receive buffers
  S_Recv = (double *) calloc( M*halo,sizeof(double) );
  N_Recv = (double *) calloc( M*halo,sizeof(double) );

  // Send buffers
  S_Send = (double *) calloc( M*halo,sizeof(double) );
  N_Send = (double *) calloc( M*halo,sizeof(double) );

  ...
  // send buffers filled with data
  // recv buffers filled with zeros (is this ok...or do I need to use malloc?)
  ...

  if (S_neighbor == MPI_PROC_NULL)
  {
    MPI_Status status[2];
    MPI_Request req[2];

    MPI_Isend(&N_Send,halo*M,MPI_DOUBLE,N_neighbor,2,topology_comm,&req[0]);
    MPI_Irecv(&N_Recv,halo*M,MPI_DOUBLE,N_neighbor,2,topology_comm,&req[1]);
    ...
    ...
    MPI_Waitall(2,req,status);
  }
  else if (N_neighbor == MPI_PROC_NULL)
  {
    MPI_Status status[2];
    MPI_Request req[2];

    MPI_Isend(&S_Send,halo*M,MPI_DOUBLE,S_neighbor,1,topology_comm,&req[0]);
    MPI_Irecv(&S_Recv,halo*M,MPI_DOUBLE,S_neighbor,1,topology_comm,&req[1]);
    ...
    ...
    MPI_Waitall(2,req,status);
  }
  else
  {
    MPI_Status status[4];
    MPI_Request req[4];

    MPI_Isend(&S_Send,halo*M,MPI_DOUBLE,S_neighbor,1,topology_comm,&req[0]);
    MPI_Isend(&N_Send,halo*M,MPI_DOUBLE,N_neighbor,2,topology_comm,&req[1]);

    MPI_Irecv(&N_Recv,halo*M,MPI_DOUBLE,N_neighbor,2,topology_comm,&req[2]);
    MPI_Irecv(&S_Recv,halo*M,MPI_DOUBLE,S_neighbor,1,topology_comm,&req[3]);
    ...
    ...
    MPI_Waitall(4,req,status);
  }
  ...
}

This was my original understanding, which is obviously missing something: since each process calls this subroutine, all of the send/recv functions are called. Then every process waits at its MPI_Waitall point for the corresponding communications to take place, and when they are done it moves on... Can someone tell me why mine isn't moving? Also, I'm not too clear on the "tag" argument (is that a clue?). Thanks for all your help in advance!

Upvotes: 0

Views: 1072

Answers (2)

Jonathan Dursi

Reputation: 50937

This body of code

  MPI_Status status[4];
  MPI_Request req[4];

  MPI_Isend(&S_Send,halo*M,MPI_DOUBLE,S_neighbor,1,topology_comm,&req[0]);
  MPI_Isend(&N_Send,halo*M,MPI_DOUBLE,N_neighbor,2,topology_comm,&req[1]);

  MPI_Irecv(&N_Recv,halo*M,MPI_DOUBLE,N_neighbor,2,topology_comm,&req[2]);
  MPI_Irecv(&S_Recv,halo*M,MPI_DOUBLE,S_neighbor,1,topology_comm,&req[3]);
      ...
      ...
  MPI_Waitall(4,req,status);

is mostly fine, and you shouldn't have to wrap ifs around the MPI_PROC_NULL neighbours; that's what MPI_PROC_NULL is for, so that you can push the corner cases into the MPI routines themselves, greatly simplifying the developer-facing communications code.
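
As a side note, here is a hypothetical setup sketch (the names are assumed, not taken from the question) of how those MPI_PROC_NULL neighbours typically come about: with a non-periodic 1D Cartesian topology, MPI_Cart_shift already returns MPI_PROC_NULL for the missing neighbours at the ends of the domain, so the exchange code never has to special-case the boundaries.

/* somewhere in the setup code, after MPI_Init */
int nprocs, dims[1] = {0}, periods[1] = {0};   /* periods = 0: not periodic */
MPI_Comm topology_comm;

MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
MPI_Dims_create(nprocs, 1, dims);
MPI_Cart_create(MPI_COMM_WORLD, 1, dims, periods, 1, &topology_comm);

int S_neighbor, N_neighbor;   /* which end counts as "south" is just a convention */
MPI_Cart_shift(topology_comm, 0, 1, &S_neighbor, &N_neighbor);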

The issue here is in fact the tags. Tags are attached to individual messages; they can be any non-negative integer up to an implementation-defined maximum, but the key is that the sender and the receiver have to agree on the tag.

If you are sending your north neighbour some data with tag 2, that's fine, but now pretend that you're the north neighbour; you're going to receive that same message from your south neighbour with tag 2. Similarly, if you're going to send your south neighbour data with tag 1, that south neighbour is going to need to receive it from its north neighbour with tag 1.

So you actually want

  MPI_Isend(&S_Send,halo*M,MPI_DOUBLE,S_neighbor,1,topology_comm,&req[0]);
  MPI_Isend(&N_Send,halo*M,MPI_DOUBLE,N_neighbor,2,topology_comm,&req[1]);

  MPI_Irecv(&N_Recv,halo*M,MPI_DOUBLE,N_neighbor,1,topology_comm,&req[2]);
  MPI_Irecv(&S_Recv,halo*M,MPI_DOUBLE,S_neighbor,2,topology_comm,&req[3]);

Update, based on the OP's comment below:

And in fact, since S_Recv etc. are already pointers to the data, allocated as:

  S_Recv = (double *) calloc( M*halo,sizeof(double) );

what you really want is:

  MPI_Isend(S_Send,halo*M,MPI_DOUBLE,S_neighbor,1,topology_comm,&req[0]);
  MPI_Isend(N_Send,halo*M,MPI_DOUBLE,N_neighbor,2,topology_comm,&req[1]);

  MPI_Irecv(N_Recv,halo*M,MPI_DOUBLE,N_neighbor,1,topology_comm,&req[2]);
  MPI_Irecv(S_Recv,halo*M,MPI_DOUBLE,S_neighbor,2,topology_comm,&req[3]);
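
Putting the pieces together (no if/else around MPI_PROC_NULL, matched tags, and the buffer pointers passed directly), a minimal sketch of the whole exchange could look like the following. The parameter list here is assumed, since the question only shows part of the original subroutine:

#include <mpi.h>

/* Hypothetical drop-in version of the exchange: every rank executes the same
   four calls; sends/receives to MPI_PROC_NULL complete immediately, so the
   physical boundaries need no special treatment. */
void halo_exchange(double *S_Send, double *N_Send,
                   double *S_Recv, double *N_Recv,
                   int halo, int M,
                   MPI_Comm topology_comm, int S_neighbor, int N_neighbor)
{
  MPI_Request req[4];

  /* tag 1: data travelling south, tag 2: data travelling north */
  MPI_Isend(S_Send, halo*M, MPI_DOUBLE, S_neighbor, 1, topology_comm, &req[0]);
  MPI_Isend(N_Send, halo*M, MPI_DOUBLE, N_neighbor, 2, topology_comm, &req[1]);

  /* what arrives from the north was travelling south (tag 1), and vice versa */
  MPI_Irecv(N_Recv, halo*M, MPI_DOUBLE, N_neighbor, 1, topology_comm, &req[2]);
  MPI_Irecv(S_Recv, halo*M, MPI_DOUBLE, S_neighbor, 2, topology_comm, &req[3]);

  MPI_Waitall(4, req, MPI_STATUSES_IGNORE);
}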

Upvotes: 4

Hristo Iliev

Reputation: 74455

Besides getting the tags right, you could further improve your code. The data exchange that you are implementing with non-blocking operations is so common that MPI provides its own call for it: MPI_SENDRECV. With it, your code would be simplified to:

MPI_Sendrecv(S_Send, halo*M, MPI_DOUBLE, S_neighbor, 0,
             N_Recv, halo*M, MPI_DOUBLE, N_neighbor, 0,
             topology_comm, MPI_STATUS_IGNORE);
MPI_Sendrecv(N_Send, halo*M, MPI_DOUBLE, N_neighbor, 0,
             S_Recv, halo*M, MPI_DOUBLE, S_neighbor, 0,
             topology_comm, MPI_STATUS_IGNORE);

Several points here:

  • You don't need to use separate tags for communications going in different directions. It only leads to confusion as in your original question.
  • Replacing your MPI_ISEND/MPI_IRECV scheme with the sequence of MPI_SENDRECV calls outlined above allows you to easily extend the halo swap to 2D, 3D, and higher-dimensional cases. Of course you can still use non-blocking sends, but when the swaps are done sequentially with MPI_SENDRECV, the diagonal elements automagically end up in the respective halo too (i.e. in 2D the topmost leftmost local element is moved into the halo of the top-left diagonal neighbour), as sketched below.
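
A minimal sketch of that 2D extension, under assumed names and layout (not from the question): an M x N interior with a halo-wide ghost layer on every side, stored row-major, with W_neighbor/E_neighbor obtained the same way as the N/S ones. The second phase sends full-height columns, halo rows included, which is exactly what carries the corner cells on to the diagonal neighbours.

#include <mpi.h>

void halo_exchange_2d(double *a, int M, int N, int halo, MPI_Comm cart_comm,
                      int S_neighbor, int N_neighbor,
                      int W_neighbor, int E_neighbor)
{
  int ld = N + 2*halo;             /* leading dimension = padded row length */
  #define A(i,j) a[(i)*ld + (j)]   /* row-major indexing into the padded array */

  /* Phase 1: north/south swap of 'halo' contiguous full-width rows
     (north is taken to be towards row 0 here -- a convention choice). */
  MPI_Sendrecv(&A(halo, 0),     halo*ld, MPI_DOUBLE, N_neighbor, 0,
               &A(M + halo, 0), halo*ld, MPI_DOUBLE, S_neighbor, 0,
               cart_comm, MPI_STATUS_IGNORE);
  MPI_Sendrecv(&A(M, 0),        halo*ld, MPI_DOUBLE, S_neighbor, 0,
               &A(0, 0),        halo*ld, MPI_DOUBLE, N_neighbor, 0,
               cart_comm, MPI_STATUS_IGNORE);

  /* Phase 2: west/east swap of 'halo'-wide columns over the FULL height,
     including the rows just filled in phase 1 -- this is what moves the
     corner cells to the diagonal neighbours. */
  MPI_Datatype col_t;
  MPI_Type_vector(M + 2*halo, halo, ld, MPI_DOUBLE, &col_t);
  MPI_Type_commit(&col_t);

  MPI_Sendrecv(&A(0, halo),     1, col_t, W_neighbor, 0,
               &A(0, N + halo), 1, col_t, E_neighbor, 0,
               cart_comm, MPI_STATUS_IGNORE);
  MPI_Sendrecv(&A(0, N),        1, col_t, E_neighbor, 0,
               &A(0, 0),        1, col_t, W_neighbor, 0,
               cart_comm, MPI_STATUS_IGNORE);

  MPI_Type_free(&col_t);
  #undef A
}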

Upvotes: 2
