Reputation: 335
In my matrix addition code, I am transmitting the lower bound to the other processes with MPI_Isend and tag 1, but when I run the code, all the slave processes claim to have received the same lower bound. I don't understand why.
The Output:
I am process 1 and I received 1120 as lower bound
I am process 1 and my lower bound is 1120 and my upper bound is 1682
I am process 2 and I received 1120 as lower bound
I am process 2 and my lower bound is 1120 and my upper bound is 1682
Process 0 here: I am sending lower bound 0 to process 1
Process 0 here: I am sending lower bound 560 to process 2
Process 0 here: I am sending lower bound 1120 to process 3
Timings : 13.300698 Sec
I am process 3 and I received 1120 as lower bound
I am process 3 and my lower bound is 1120 and my upper bound is 1682
The code:
#include <stdio.h>
#include <mpi.h>

#define N_ROWS 1682
#define N_COLS 823
#define MASTER_TO_SLAVE_TAG 1 //tag for messages sent from master to slaves
#define SLAVE_TO_MASTER_TAG 4 //tag for messages sent from slaves to master

void readMatrix();

int rank, nproc, proc;
double matrix_A[N_ROWS][N_COLS];
double matrix_B[N_ROWS][N_COLS];
double matrix_C[N_ROWS][N_COLS];
int low_bound;       //low bound of the number of rows of [A] allocated to a slave
int upper_bound;     //upper bound of the number of rows of [A] allocated to a slave
int portion;         //portion of the number of rows of [A] allocated to a slave
MPI_Status status;   //store status of an MPI_Recv
MPI_Request request; //capture request of an MPI_Isend

int main (int argc, char *argv[]) {
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &nproc);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    double StartTime = MPI_Wtime();

    // -------------------> Process 0 initializes the matrices and sends work portions to the other processes
    if (rank == 0) {
        readMatrix();
        for (proc = 1; proc < nproc; proc++) { //for each slave other than the master
            portion = (N_ROWS / (nproc - 1)); //calculate the portion without the master
            low_bound = (proc - 1) * portion;
            if (((proc + 1) == nproc) && ((N_ROWS % (nproc - 1)) != 0)) { //if the rows of [A] cannot be divided evenly among the slaves
                upper_bound = N_ROWS; //the last slave gets all the remaining rows
            } else {
                upper_bound = low_bound + portion; //the rows of [A] divide evenly among the slaves
            }
            //send the low bound first, without blocking, to the intended slave
            printf("Process 0 here: I am sending lower bound %i to process %i \n", low_bound, proc);
            MPI_Isend(&low_bound, 1, MPI_INT, proc, MASTER_TO_SLAVE_TAG, MPI_COMM_WORLD, &request);
            //next send the upper bound, without blocking, to the intended slave
            MPI_Isend(&upper_bound, 1, MPI_INT, proc, MASTER_TO_SLAVE_TAG + 1, MPI_COMM_WORLD, &request);
            //finally send the allocated row portion of [A], without blocking, to the intended slave
            MPI_Isend(&matrix_A[low_bound][0], (upper_bound - low_bound) * N_COLS, MPI_DOUBLE, proc, MASTER_TO_SLAVE_TAG + 2, MPI_COMM_WORLD, &request);
        }
    }

    //broadcast [B] to all the slaves
    MPI_Bcast(&matrix_B, N_ROWS * N_COLS, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    // -------------------> The other processes do their work
    if (rank != 0) {
        //receive the low bound from the master
        MPI_Recv(&low_bound, 1, MPI_INT, 0, MASTER_TO_SLAVE_TAG, MPI_COMM_WORLD, &status);
        printf("I am process %i and I received %i as lower bound \n", rank, low_bound);
        //next receive the upper bound from the master
        MPI_Recv(&upper_bound, 1, MPI_INT, 0, MASTER_TO_SLAVE_TAG + 1, MPI_COMM_WORLD, &status);
        //finally receive the row portion of [A] to be processed from the master
        MPI_Recv(&matrix_A[low_bound][0], (upper_bound - low_bound) * N_COLS, MPI_DOUBLE, 0, MASTER_TO_SLAVE_TAG + 2, MPI_COMM_WORLD, &status);
        printf("I am process %i and my lower bound is %i and my upper bound is %i \n", rank, low_bound, upper_bound);
        //do the work
        for (int i = low_bound; i < upper_bound; i++) {
            for (int j = 0; j < N_COLS; j++) {
                matrix_C[i][j] = (matrix_A[i][j] + matrix_B[i][j]);
            }
        }
        //send the low bound back first, without blocking, to the master
        MPI_Isend(&low_bound, 1, MPI_INT, 0, SLAVE_TO_MASTER_TAG, MPI_COMM_WORLD, &request);
        //next send the upper bound, without blocking, to the master
        MPI_Isend(&upper_bound, 1, MPI_INT, 0, SLAVE_TO_MASTER_TAG + 1, MPI_COMM_WORLD, &request);
        //finally send the processed portion of the data, without blocking, to the master
        MPI_Isend(&matrix_C[low_bound][0], (upper_bound - low_bound) * N_COLS, MPI_DOUBLE, 0, SLAVE_TO_MASTER_TAG + 2, MPI_COMM_WORLD, &request);
    }

    // -------------------> Process 0 gathers the work
    ...
Upvotes: 2
Views: 2359
Reputation: 9817
MPI_Isend() begins a non-blocking send. Hence, modifying the buffer that is being sent, without checking that the message has actually completed, results in wrong values being sent.
This is what happens in the piece of code you provided, in the loop over processes for (proc = 1; proc < nproc; proc++):

- proc=1: low_bound is computed.
- proc=1: low_bound is sent (non-blocking) to process 1.
- proc=2: low_bound is modified. The message to process 1 is corrupted.
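To see the hazard in isolation, here is a minimal made-up sketch (not from your code) in which the receivers may all observe the value intended for the last destination, just like in your output:

int value;
MPI_Request req;
for (int dest = 1; dest < nproc; dest++) {
    value = 100 * dest; //overwrite the send buffer on every iteration
    MPI_Isend(&value, 1, MPI_INT, dest, 0, MPI_COMM_WORLD, &req);
    //no MPI_Wait() here: MPI may read `value` only after a later iteration
    //has overwritten it; reusing `req` also loses the previous request handle
}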
Different solutions exist:
- Use the blocking send MPI_Send().
- Check that the messages have completed: create an array of requests (MPI_Request requests[3]; MPI_Status statuses[3];), use non-blocking sends, and check the completion of the requests with MPI_Waitall():
MPI_Isend(&low_bound, 1, MPI_INT, proc, MASTER_TO_SLAVE_TAG, MPI_COMM_WORLD, &requests[0]);
MPI_Isend(..., &requests[1]);
MPI_Isend(..., &requests[2]);
MPI_Waitall(3, requests, statuses);
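Put together, the master loop could look like the sketch below. It is adapted from your code; only the request handling changes, and the wait happens before the next iteration overwrites the bounds:

MPI_Request requests[3];
MPI_Status statuses[3];
for (proc = 1; proc < nproc; proc++) {
    portion = (N_ROWS / (nproc - 1));
    low_bound = (proc - 1) * portion;
    if (((proc + 1) == nproc) && ((N_ROWS % (nproc - 1)) != 0)) {
        upper_bound = N_ROWS;
    } else {
        upper_bound = low_bound + portion;
    }
    MPI_Isend(&low_bound, 1, MPI_INT, proc, MASTER_TO_SLAVE_TAG, MPI_COMM_WORLD, &requests[0]);
    MPI_Isend(&upper_bound, 1, MPI_INT, proc, MASTER_TO_SLAVE_TAG + 1, MPI_COMM_WORLD, &requests[1]);
    MPI_Isend(&matrix_A[low_bound][0], (upper_bound - low_bound) * N_COLS, MPI_DOUBLE, proc, MASTER_TO_SLAVE_TAG + 2, MPI_COMM_WORLD, &requests[2]);
    //wait here, before the next iteration modifies low_bound and upper_bound
    MPI_Waitall(3, requests, statuses);
}

Note that waiting inside the loop serializes the iterations much like MPI_Send() would; the non-blocking calls mainly let the three messages to one slave progress together.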
- Take a look at MPI_Scatter() and MPI_Scatterv()!
The "usual" way to do this is to MPI_Bcast()
the size of the matrix. Then each process computes the size of its part of the matrix. Process 0 computes the sendcounts
and displs
needed by MPI_Scatterv()
.
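As an illustration, here is a sketch of that pattern. The names sendcounts, displs and local_A are made up for the example; it assumes the definitions from your code plus #include <stdlib.h> for malloc(). Since N_ROWS and N_COLS are compile-time constants here, the initial MPI_Bcast() of the size is only needed when the dimensions are read at run time.

int *sendcounts = malloc(nproc * sizeof(int));
int *displs = malloc(nproc * sizeof(int));
int offset = 0;
for (int p = 0; p < nproc; p++) {
    int rows_p = N_ROWS / nproc + (p < N_ROWS % nproc ? 1 : 0); //spread the remainder rows
    sendcounts[p] = rows_p * N_COLS; //counts are in elements, not rows
    displs[p] = offset;
    offset += sendcounts[p];
}
double *local_A = malloc(sendcounts[rank] * sizeof(double));
MPI_Scatterv(&matrix_A[0][0], sendcounts, displs, MPI_DOUBLE,
             local_A, sendcounts[rank], MPI_DOUBLE, 0, MPI_COMM_WORLD);
//each process, master included, now owns sendcounts[rank] / N_COLS rows of [A]

A single MPI_Scatterv() call replaces the whole send/receive loop, and rank 0 takes part in the computation as well.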
Upvotes: 3