User

Reputation: 399

MPI segmentation fault with MPI_Bcast communication

I have an MPI program that works fine on a single node, and it also runs fine on multiple nodes with communication disabled (albeit with the wrong numerical result, since the calculation needs the communication). However, I get an "address not mapped" segmentation fault when I run it with communication enabled.

I have cut my program down to the basic outline below. There is a struct defined outside my main program, and an array of these structs that is divided amongst the MPI nodes; for example, with 500 particles and 5 nodes, the first node operates on particles 0-99, and so on. After each operation all the nodes need to synchronise their data, which I try to achieve with MPI_Bcast.

Is there a problem with using a global struct in communication between MPI nodes? Or why else might this communication structure give a segmentation fault?

#define BUFFER 2

typedef struct {
    double X[BUFFER];
} Particle;

Particle *body;

void Communication(int i_buf, int size){

  // there are N particles which have been operated on by the various MPI nodes
  // we seek to synchronize this data between all nodes
  #pragma omp parallel for
  for(int nbody = 0; nbody < N; nbody++)
  {
    // below returns the node number which operated on this particle  
    int node = IsNode(nbody, size);

    MPI_Bcast(&(body[nbody].X[i_buf]), 1, MPI_DOUBLE, node, MPI_COMM_WORLD);
  }

}

int main(int argc, char **argv) {

  MPI_Init(&argc, &argv);

  int rank, size;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  body = (Particle*) calloc((size_t)N, sizeof(Particle));

  int adjust_a, adjust_b, a, b;
  adjust_a = Min(rank, N % size);
  adjust_b = Min(rank + 1, N % size);
  a = N / size * rank + adjust_a;
  b = N / size * (rank + 1) + adjust_b;


  int i_buf, i_buf_old;

  for(int i_time = 0; i_time <= time_steps; i_time++)
  {

    i_buf = Mod(i_time, BUFFER);
    i_buf_old = Mod(i_time - 1, BUFFER);

    #pragma omp sections
    {
      #pragma omp section
      {
        Compute(i_buf, i_buf_old, a, b); // just does some calc
        Communication(i_buf, size);
      }
      #pragma omp section
      {
        if((i_time != 0) && (rank == 0))
          LogThread(i_buf_old); // just writes to a file
      }
    }

  }

  MPI_Finalize();
  return 0;
}

My full struct is:

typedef struct {
    double mass;
    double X[BUFFER];
    double Y[BUFFER];
    double Z[BUFFER];
    double Vx[BUFFER];
    double Vy[BUFFER];
    double Vz[BUFFER];
    double Fx[BUFFER];
    double Fy[BUFFER];
    double Fz[BUFFER];
} Particle;

Upvotes: 1

Views: 863

Answers (1)

Wesley Bland

Reputation: 9082

It sounds like you might be using the wrong tool for the job. If you're trying to aggregate an array of data from all of the processes and make sure that all processes have the result of that aggregation, you should probably be using MPI_ALLGATHER.

This function does pretty much what you want. If you're in a situation where each process has some data (an integer, for instance) like this:

0)    0
1)    1
2)    2
3)    3

When you do the allgather, the result will be this:

0) 0 1 2 3
1) 0 1 2 3
2) 0 1 2 3
3) 0 1 2 3
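
For reference, here is a minimal, self-contained sketch of that integer case (the variable names are just for illustration): each process contributes one int, and after the call every process holds the full array.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {

  MPI_Init(&argc, &argv);

  int rank, size;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  // each process contributes a single int (its own rank here)
  int mine = rank;

  // after the call every process holds all contributions: 0 1 2 ... size-1
  int *all = (int*) malloc(size * sizeof(int));
  MPI_Allgather(&mine, 1, MPI_INT, all, 1, MPI_INT, MPI_COMM_WORLD);

  printf("rank %d has:", rank);
  for(int i = 0; i < size; i++)
    printf(" %d", all[i]);
  printf("\n");

  free(all);
  MPI_Finalize();
  return 0;
}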

If, for instance, you only needed the data to be aggregated in one place (such as rank 0), you could just use MPI_GATHER, which would look like this:

0) 0 1 2 3
1)    1
2)    2
3)    3
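
Applied to the code in your question, the per-particle broadcast loop could then collapse into a single collective call. The sketch below is one possible shape rather than a drop-in fix: it reuses your N, Min, Particle and body definitions (plus the <mpi.h> and <stdlib.h> includes), assumes each rank owns the contiguous range [a, b) computed in main(), and sends whole Particle structs as raw bytes, which is only safe on a homogeneous cluster and transfers more than just X[i_buf]. Because the per-rank counts are uneven, it uses MPI_Allgatherv rather than MPI_Allgather:

void Communication(int size) {

  // per-rank counts and displacements, in bytes, matching the a/b
  // decomposition used in main()
  int *counts = (int*) malloc(size * sizeof(int));
  int *displs = (int*) malloc(size * sizeof(int));

  for(int r = 0; r < size; r++) {
    int a = N / size * r       + Min(r,     N % size);
    int b = N / size * (r + 1) + Min(r + 1, N % size);
    counts[r] = (b - a) * (int) sizeof(Particle);
    displs[r] = a       * (int) sizeof(Particle);
  }

  // MPI_IN_PLACE: each rank's own slice is read straight out of body[],
  // and afterwards every rank holds the full, synchronised array
  MPI_Allgatherv(MPI_IN_PLACE, 0, MPI_DATATYPE_NULL,
                 body, counts, displs, MPI_BYTE, MPI_COMM_WORLD);

  free(counts);
  free(displs);
}

Note that the i_buf argument goes away here because the whole struct is exchanged; if you only want to ship the X[i_buf] values, you would instead build a strided derived datatype (e.g. with MPI_Type_vector) and gather that.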

Upvotes: 1
