Reputation: 508
I am pretty new to MPI and am trying to understand the basics by writing a simple C program. All I want to do is split an array and send the blocks to N processors, so that each processor finds the local min in its block. Then the program (in the root or somewhere else) finds the global min.
I have looked at the MPI_Send, MPI_Isend, and MPI_Bcast functions, but I am a bit confused about when to use one instead of another. I need some tips on the general structure of my program:
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define N 9 // array size
int A[N] = {0,2,1,5,4,3,7,6,8}; // this is a dummy array

int main(int argc, char *argv[]) {
    int i, k = 0, size, rank, source = 0, dest = 1, count;
    int tag = 1234;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    count = N/(size-1); // think size = 4 for this example
    int *tempArray = malloc(count * sizeof(int));
    int *localMins = malloc((size-1) * sizeof(int));

    if (rank == 0) {
        for (i = 0; i < size; i += count)
        {
            // Is it better to use MPI_Isend or MPI_Bcast here?
            MPI_Send(&A[i], count, MPI_INT, dest, tag, MPI_COMM_WORLD);
            printf("P0 sent a %d elements to P%d.\n", count, dest);
            dest++;
        }
    }
    else {
        for (i = 0; i < size; i += count)
        {
            MPI_Recv(tempArray, count, MPI_INT, 0, tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            localMins[k] = findMin(tempArray, count);
            printf("Min for P%d is %d.\n", rank, localMins[k]);
            k++;
        }
    }

    MPI_Finalize();

    int gMin = findMin(localMins, (size-1)); // where should I assign this
    printf("Global min: %d\n", gMin);        // and where should I print the results?
    return 0;
}
There could be multiple errors in my code, and I am sorry that I cannot point to an exact problem here. Thanks for any suggestion.
Upvotes: 0
Views: 7562
Reputation: 4623
There are several problems with the code you have (as you've already pointed out), and as some commenters have already mentioned, there are alternative ways to perform what you are trying to do using MPI calls.
However, I am going to repurpose your code and try not to change too much in order to show you what's going on.
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define N 9 // array size
int A[N] = {0,2,1,5,4,3,7,6,8}; // this is a dummy array that should only be initialized on rank == ROOT

// sequential min over n ints (you need to define this helper in any case)
int findMin(const int *arr, int n) {
    int min = arr[0];
    for (int i = 1; i < n; ++i)
        if (arr[i] < min)
            min = arr[i];
    return min;
}

int main(int argc, char *argv[]) {
    int size;
    int rank;
    const int VERY_LARGE_INT = 999999;
    const int ROOT = 0; // the master rank that holds A to begin with
    int tag = 1234;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size); // think size = 4 for this example
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /*
       How many numbers you send from ROOT to each other rank.
       Note that for this implementation to work, (size-1) must divide N.
    */
    int count = N/(size-1);
    int *localArray = malloc(count * sizeof(int));
    int localMin;   // minimum computed on rank i
    int globalMin;  // will only be valid on rank == ROOT

    /* rank == ROOT sends a portion of A to every other rank */
    if (rank == ROOT) {
        for (int dest = 1; dest < size; ++dest)
        {
            // If you are sending information from one rank to another, you use MPI_Send or MPI_Isend.
            // If you are sending information from one rank to ALL others, then every rank must call MPI_Bcast (similar to MPI_Reduce below)
            MPI_Send(&A[(dest-1)*count], count, MPI_INT, dest, tag, MPI_COMM_WORLD);
            printf("P0 sent %d elements to P%d.\n", count, dest);
        }
        localMin = VERY_LARGE_INT; // needed for MPI_Reduce below
    }
    /* Every other rank receives exactly one message: from ROOT into its local array */
    else {
        MPI_Recv(localArray, count, MPI_INT, ROOT, tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        localMin = findMin(localArray, count);
        printf("Min for P%d is %d.\n", rank, localMin);
    }

    /*
       At this point, every rank in the communicator has valid information stored in localMin.
       Use MPI_Reduce to find the global min among all ranks.
       Store this single globalMin on rank == ROOT.
    */
    MPI_Reduce(&localMin, &globalMin, 1, MPI_INT, MPI_MIN, ROOT, MPI_COMM_WORLD);

    if (rank == ROOT)
        printf("Global min: %d\n", globalMin);

    free(localArray);

    /* The last thing you do is finalize MPI. Nothing should come after. */
    MPI_Finalize();
    return 0;
}
Full disclosure: I have not tested this code, but besides minor typos, it should work.
Look over this code and see if you can understand why I moved your MPI_Send and MPI_Recv calls around. To understand this, note that every rank executes every line of code you give it. Thus, inside your else statement, there should not be a for loop of receives.
Furthermore, the MPI collectives (such as MPI_Reduce and MPI_Bcast) must be called by every rank in a communicator. The "source" and "destination" ranks for these calls are part of the function input parameters or implied by the collective itself.
Lastly, a little bit of homework for you: can you see why this is NOT a good implementation of finding the global min of the array A? HINT: what is rank == ROOT doing after it finishes its MPI_Sends? How would you better split up this problem so that every rank is more evenly performing work?
Upvotes: 3