Reputation: 33
I have an array of numbers that I need to scatter to each node in an MPI program. The setup is that I have an array of numbers from 1 to 100 with all the even numbers except the number 2 removed. Due to the way I removed the even numbers, the number 2 is the last element in the array.
So my array contains 51 numbers: 1, 3, 5, 7, ... 99, 2. My problem is that the final partition after a scatter does not contain the last three numbers in the array - 97, 99 and 2.
int *oddsOnly = //array as setup above, 3,5,7,...99,2
int chunkSize = (oddsFound / worldSize);
int *localPartition = new int[chunkSize];
// Send everyone the chunk size
MPI_Bcast(&chunkSize, 1, MPI_INT, 0, MPI_COMM_WORLD);
MPI_Scatter(oddsOnly, chunkSize, MPI_INT, localPartition, chunkSize, MPI_INT, 0, MPI_COMM_WORLD);
I understand the issue is that the array size doesn't divide evenly by the number of ranks. I've tried
chunkSize = ceil(oddsFound / worldSize);
And
chunkSize = (oddsFound / worldSize) + 1;
But this then gives me duplicates in the split.
As it stands I get
0 scatter is 1 3 5 7 9 11 13 15 17 19 21 23
1 scatter is 25 27 29 31 33 35 37 39 41 43 45 47
2 scatter is 49 51 53 55 57 59 61 63 65 67 69 71
3 scatter is 73 75 77 79 81 83 85 87 89 91 93 95
Is it possible to do what I'm attempting tidily? I had a look at MPI_Scatterv but I'm not sure it's what I need. I should add that I don't have to use scatter, so maybe there is a better MPI way of doing it.
Upvotes: 1
Views: 987
Reputation: 51393
One approach is to use MPI_Scatter and add padding to the array so that its size is evenly divisible among the processes. However, whenever a process has to manipulate that array, it needs to ignore the padding part.
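A minimal sketch of that padding idea (the sentinel value -1, the odd-number fill and the padded-length arithmetic are assumptions for illustration, not something taken from the question):

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char *argv[]){
    MPI_Init(&argc, &argv);

    int world_rank, world_size;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    int size = 51;                                                    // real number of elements
    int padded = ((size + world_size - 1) / world_size) * world_size; // rounded up to a multiple of world_size
    int chunk = padded / world_size;                                  // now divides evenly

    int *numbers = NULL;
    if(world_rank == 0){
        numbers = malloc(sizeof(int) * padded);
        for(int i = 0; i < size; i++)
            numbers[i] = 2 * i + 1;    // some payload (here: odd numbers)
        for(int i = size; i < padded; i++)
            numbers[i] = -1;           // sentinel padding
    }

    int *local = malloc(sizeof(int) * chunk);
    MPI_Scatter(numbers, chunk, MPI_INT, local, chunk, MPI_INT, 0, MPI_COMM_WORLD);

    for(int i = 0; i < chunk; i++){
        if(local[i] == -1)
            continue;                  // every rank must skip the padding
        printf("rank %d got %d\n", world_rank, local[i]);
    }

    free(local);
    if(world_rank == 0)
        free(numbers);
    MPI_Finalize();
    return 0;
}

The drawback is exactly the one mentioned above: every rank has to check for and skip the sentinel whenever it touches its chunk.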
Another approach is to use MPI_Scatterv, and "manually" divide the blocks among the processes.
An example of the MPI_Scatterv approach:
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char *argv[]){
    MPI_Init(NULL, NULL); // Initialize the MPI environment

    int world_rank;
    int world_size;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    int size = 50;
    int *numbers = NULL;
    if(world_rank == 0){
        // Only the root allocates and fills the full array (the odd numbers 1..99)
        numbers = malloc(sizeof(int) * size);
        for(int i = 0; i < size; i++)
            numbers[i] = (2 * i + 1);
    }

    // The Important Part
    int sendcounts[world_size];
    int displs[world_size];
    int res = size % world_size;               // leftover elements
    int size_per_process = size / world_size;  // guaranteed elements per process
    int increment = 0;
    for(int processID = 0; processID < world_size; processID++){
        displs[processID] = increment;
        sendcounts[processID] = (processID + 1 <= res) ? size_per_process + 1 : size_per_process;
        increment += sendcounts[processID];
    }

    int process_size = sendcounts[world_rank];
    int local_numbers[process_size];
    MPI_Scatterv(numbers, sendcounts, displs, MPI_INT, local_numbers, process_size, MPI_INT, 0, MPI_COMM_WORLD);

    if(world_rank == world_size - 1){
        // Print the chunk received by the last rank (process_size is its actual count)
        for(int i = 0; i < process_size; i++)
            printf("%d ", local_numbers[i]);
        printf("\n");
    }

    if(world_rank == 0)
        free(numbers);

    MPI_Finalize();
    return 0;
}
It is a bit convoluted, but the good news is that the counting logic is always the same for this type of block distribution.
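If you want to try it, the example builds and runs with any standard MPI toolchain such as Open MPI or MPICH (the binary name and process count here are just an illustration):

mpicc scatterv_example.c -o scatterv_example
mpirun -np 4 ./scatterv_example

With 4 processes and size = 50 the counts work out to 13, 13, 12 and 12, so the last rank prints 77 79 81 83 85 87 89 91 93 95 97 99 - the tail of the array is no longer dropped.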
Explanation of the code:
First we need to understand the parameters of MPI_Scatterv:
MPI_Scatterv
Scatters a buffer in parts to all processes in a communicator
int MPI_Scatterv(const void *sendbuf, const int *sendcounts, const int *displs, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, int root, MPI_Comm comm)
Input Parameters
sendbuf - address of send buffer (choice, significant only at root)
sendcounts - integer array (of length group size) specifying the number of elements to send to each processor
displs - integer array (of length group size). Entry i specifies the displacement (relative to sendbuf) from which to take the outgoing data to process i
sendtype - data type of send buffer elements (handle)
recvcount - number of elements in receive buffer (integer)
recvtype - data type of receive buffer elements (handle)
root - rank of sending process (integer)
comm - communicator (handle)
First, we create the array holding the number of elements that the root will send to each process:
int sendcounts [world_size];
then we create the array with the displacements:
int displs [world_size];
then we calculate the number of leftover elements (the remainder):
int res = size % world_size;
then we calculate the number of elements that each process is guaranteed to receive:
int size_per_process = size / world_size;
Finally, we calculate the size and the displacement of each process:
int increment = 0;
for(int processID = 0; processID < world_size; processID++){
displs[processID] = increment;
sendcounts[processID] = (processID + 1 <= res) ? size_per_process + 1 : size_per_process;
increment += sendcounts[processID];
}
the line:
sendcounts[processID] = (processID + 1 <= res) ? size_per_process + 1 : size_per_process;
means that we assign the extra elements from left to right, i.e., the first res ranks (rank 0, 1, ..., res - 1) each receive one extra element.
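As a quick sanity check, here is the same loop run outside MPI for the 51-element array from the question over 4 processes (these figures are worked out by hand for illustration; they are not output quoted from the code above):

#include <stdio.h>

int main(void){
    int size = 51, world_size = 4;             // the question's case
    int res = size % world_size;               // 3 extra elements
    int size_per_process = size / world_size;  // 12 guaranteed per process
    int increment = 0;
    for(int processID = 0; processID < world_size; processID++){
        int count = (processID + 1 <= res) ? size_per_process + 1 : size_per_process;
        printf("rank %d: sendcount = %d, displ = %d\n", processID, count, increment);
        increment += count;
    }
    return 0;
}

This prints counts 13, 13, 13, 12 with displacements 0, 13, 26, 39, so every one of the 51 elements - including the trailing 97, 99 and 2 - belongs to exactly one rank.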
Upvotes: 3