After performing a matrix-vector multiplication using a Cartesian topology, I ended up with the following processes, their ranks and their local vectors:
P0 (process with rank 0) = [2, 9]
P1 (process with rank 1) = [2, 3]
P2 (process with rank 2) = [1, 9]
P3 (process with rank 3) = [4, 6]
Now I need to sum the vectors of the even-rank processes and of the odd-rank processes separately, like this:
temp1 = [3, 18]
temp2 = [6, 9]
and then gather the results into a single vector, like this:
result = [3, 18, 6, 9]
My attempt was to use MPI_Reduce and then MPI_Gather, like this:
// Previous code
double *temp1, *temp2;

if (myrank % 2 == 0) {
    BOOLEAN flag = Allocate_vector(&temp1, local_m); // function to allocate space for vectors
    MPI_Reduce(local_y, temp1, local_n, MPI_DOUBLE, MPI_SUM, 0, comm);
    MPI_Gather(temp1, local_n, MPI_DOUBLE, gResult, local_n, MPI_DOUBLE, 0, comm);
    free(temp1);
}
else {
    Allocate_vector(&temp2, local_m);
    MPI_Reduce(local_y, temp2, local_n, MPI_DOUBLE, MPI_SUM, 0, comm);
    MPI_Gather(temp2, local_n, MPI_DOUBLE, gResult, local_n, MPI_DOUBLE, 0, comm);
    free(temp2);
}
But the answer is not correct. It seems that the code sums all the elements of the even and odd processes together, giving Wrong_result = [21 15 0 0], and then the program crashes with the following error:
*** Error in `./test': double free or corruption (fasttop): 0x00000000013c7510 ***
*** Error in `./test': double free or corruption (fasttop): 0x0000000001605b60 ***
It won't work the way you are trying to do it. To perform a reduction over the elements of a subset of processes, you have to create a subcommunicator for them. In your case, the odd and the even processes share the same comm, therefore the operations are not over the two separate groups of processes but rather over the combined group.
You should use MPI_Comm_split to perform a split, perform the reduction using the two new subcommunicators, and finally have rank 0 in each subcommunicator (let's call those leaders) participate in the gather over another subcommunicator that contains only those two:
// Make sure rank is set accordingly
MPI_Comm_rank(comm, &rank);
// Split even and odd ranks in separate subcommunicators
MPI_Comm subcomm;
MPI_Comm_split(comm, rank % 2, 0, &subcomm);
// Perform the reduction in each separate group
double *temp;
Allocate_vector(&temp, local_n);
MPI_Reduce(local_y, temp, local_n , MPI_DOUBLE, MPI_SUM, 0, subcomm);
// Find out our rank in subcomm
int subrank;
MPI_Comm_rank(subcomm, &subrank);
// At this point, we no longer need subcomm. Free it and reuse the variable.
MPI_Comm_free(&subcomm);
// Separate both group leaders (rank 0) into their own subcommunicator
MPI_Comm_split(comm, subrank == 0 ? 0 : MPI_UNDEFINED, 0, &subcomm);
if (subcomm != MPI_COMM_NULL) {
    MPI_Gather(temp, local_n, MPI_DOUBLE, gResult, local_n, MPI_DOUBLE, 0, subcomm);
    MPI_Comm_free(&subcomm);
}
// Free resources
free(temp);
The result will be in gResult of rank 0 in the latter subcomm, which happens to be rank 0 in comm because of the way the splits are performed.
Not as simple as expected, I guess, but that is the price of having convenient collective operations in MPI.
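For illustration, here is a minimal self-contained sketch of this approach. It assumes exactly 4 processes, hard-codes the vectors from the question, and uses plain malloc instead of your Allocate_vector helper, so treat it as an illustration rather than a drop-in replacement for your code:

/* Sketch: even/odd element-wise sums via subcommunicators, gathered on rank 0.
 * Assumes exactly 4 processes and the local vectors from the question.
 * Compile with mpicc and run with: mpiexec -n 4 ./even_odd_sum
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    MPI_Comm comm = MPI_COMM_WORLD;
    int rank, nprocs;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &nprocs);
    if (nprocs != 4) {
        if (rank == 0) fprintf(stderr, "Run with exactly 4 processes\n");
        MPI_Abort(comm, 1);
    }

    /* Local vectors from the question: P0=[2,9], P1=[2,3], P2=[1,9], P3=[4,6] */
    const int local_n = 2;
    double data[4][2] = { {2, 9}, {2, 3}, {1, 9}, {4, 6} };
    double *local_y = data[rank];

    /* Split even and odd ranks into separate subcommunicators */
    MPI_Comm subcomm;
    MPI_Comm_split(comm, rank % 2, 0, &subcomm);

    /* Element-wise sum inside each group; result lands on each group leader (subrank 0) */
    double *temp = malloc(local_n * sizeof(double));
    MPI_Reduce(local_y, temp, local_n, MPI_DOUBLE, MPI_SUM, 0, subcomm);

    int subrank;
    MPI_Comm_rank(subcomm, &subrank);
    MPI_Comm_free(&subcomm);

    /* Put the two group leaders (original ranks 0 and 1) into their own communicator */
    MPI_Comm_split(comm, subrank == 0 ? 0 : MPI_UNDEFINED, 0, &subcomm);

    double gResult[4] = {0, 0, 0, 0};
    if (subcomm != MPI_COMM_NULL) {
        MPI_Gather(temp, local_n, MPI_DOUBLE, gResult, local_n, MPI_DOUBLE, 0, subcomm);
        MPI_Comm_free(&subcomm);
    }
    free(temp);

    if (rank == 0)
        printf("result = [%g, %g, %g, %g]\n",
               gResult[0], gResult[1], gResult[2], gResult[3]);
        /* Expected output: result = [3, 18, 6, 9] */

    MPI_Finalize();
    return 0;
}

With the data above, rank 0 of comm should print result = [3, 18, 6, 9].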
On a side note, in the code shown you are allocating temp1 and temp2 to be of length local_m, while in all collective calls the length is specified as local_n. If it happens that local_n > local_m, then heap corruption will occur.
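Assuming Allocate_vector(&v, n) allocates space for n elements (which its use in your code suggests), a minimal sketch of that particular fix is simply to allocate with the same count that is later passed to the collectives:

double *temp1 = NULL;
Allocate_vector(&temp1, local_n); // sketch: was local_m; now matches the local_n count used in MPI_Reduce and MPI_Gather

and likewise for temp2.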