Better way to put different parts of an array spread by all processes into a single final array with MPI in C

Question

I put this code only as an example so that you can understand what I am looking for:

  double *f = malloc(sizeof(double) * nx * ny);
  double *f2 = calloc(nx * ny, sizeof(double));  /* zero-initialized, so rows owned by other processes add 0 to the sum */
  for ( i = process * (nx/totalProcesses); i < (process + 1) * (nx/totalProcesses); i++ )
  {
    for ( j = 0; j < ny; j++ )
    {
          f2[i*ny + j] = j*i;
    }
  }
  MPI_Allreduce( f2, f, nx*ny, MPI_DOUBLE, MPI_SUM, MPI_COMM);

And yes, it works: in the end I have the correct result in 'f', which is what I want, but I would like to know if there is a better or more direct way to achieve the same in order to get efficiency. I tried it with MPI_Allgather but couldn't get the correct result.

dreamcrash · Accepted Answer

but I would like to know if there is a better or more direct way to achieve the same in order to get efficiency.

No. In the given context, using an MPI collective routine is (in theory) always more efficient than the alternative of hand-written send/recv calls. Although the MPI standard does not mandate it, a good implementation typically performs collective routines such as MPI_Allreduce in on the order of log(p) communication steps, with p being the number of processes.
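To make the log(p) claim concrete, here is a small single-process simulation of one textbook scheme, recursive doubling. It is only an illustration: it is not MPI code, it says nothing about what any particular MPI library does internally, the function name simulate_allreduce_sum is made up for this sketch, and it assumes p is a power of two.

```c
#include <stdlib.h>

/* Simulates a recursive-doubling sum-allreduce over p pretend "processes".
 * vals[r] is the single value held by pretend rank r.
 * Assumption: p is a power of two (other values are rejected).
 * Returns the number of exchange rounds, or -1 on bad input or
 * allocation failure. On success every vals[r] holds the global sum. */
int simulate_allreduce_sum(double *vals, int p)
{
    if (p <= 0 || (p & (p - 1)) != 0)
        return -1;

    double *next = malloc(sizeof(double) * (size_t)p);
    if (next == NULL)
        return -1;

    int rounds = 0;
    for (int dist = 1; dist < p; dist *= 2) {
        /* In each round, rank r pairs with rank (r XOR dist) and both
         * end up with the sum of their two current values. */
        for (int rank = 0; rank < p; rank++) {
            int partner = rank ^ dist;
            next[rank] = vals[rank] + vals[partner];
        }
        for (int rank = 0; rank < p; rank++)
            vals[rank] = next[rank];
        rounds++;
    }

    free(next);
    return rounds;
}
```

With p = 8 the loop runs for dist = 1, 2, 4, so three rounds are enough for every rank to hold the full sum, whereas a naive scheme in which one process receives from each of the others in turn needs p - 1 receives.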

Bear in mind, however, that MPI_Allreduce:

Combines values from all processes and distributes the result back to all processes.

Therefore, if you do not need the result in all the processes, you can use MPI_Reduce instead:

Reduces values on all processes to a single value
