Reputation: 317
I've been learning how to implement MPI over the past couple of weeks and I'm having a very hard time to understand how to set up some of the input arguments for MPI_Allgatherv. I'll use a toy example because I need to take baby steps here. Some of the research I've done is listed at the end of this post (including my previous question, which led me to this question). First, a quick summary of what I'm trying to accomplish:
--Summary-- I'm taking a std::vector A, having multiple processors work on different parts of A, and then taking the updated parts of A and redistributing those updates to all processors. Therefore, all processors start with copies of A, update portions of A, and end with fully updated copies of A.--End--
Let's say I have a std::vector < double > containing 5 elements called "mydata" initialized as follows:
for (int i = 0; i < 5; i++)
{
mydata[i] = (i+1)*1.1;
}
Now let's say I'm running my code on 2 nodes (int tot_proc = 2). I identify the "current" node using "int id_proc," therefore, the root processor has id_proc = 0. Since the number of elements in mydata is odd, I cannot evenly distribute the work between processors. Let's say that I always break the work up as follows:
if (id_proc < tot_proc - 1)
{
//handle mydata.size()/tot_proc elements
}
else
{
//handle whatever is left over
}
In this example, that means: id_proc = 0 will work on mydata[0] and mydata[1] (2 elements, since 5/2 = 2) … and … id_proc = 1 will work on mydata[2] - mydata[4] (3 elements, since 5/2 + 5%2 = 3)
Once each processor has worked on their respective portions of mydata, I want to use Allgatherv to merge the results together so that mydata on each processor contains all of the updated values. We know Allgatherv takes 8 arguments: (1) the starting address of the elements/data being sent, (2) the number of elements being sent, (3) the type of data being sent, which is MPI_DOUBLE in this example, (4) the address of the location you want the data to be received (no mention of "starting" address), (5) the number of elements being received, (6) the "displacements" in memory relative to the receiving location in argument #4, (7) the type of data being received, again, MPI_DOUBLE, and (8) the communicator you're using, which in my case is simply MPI_COMM_WORLD.
Now here's where the confusion begins. Since processor 0 worked on the first two elements, and processor 1 worked on the last 3 elements, then processor 0 will need to SEND the first two elements, and processor 1 will need to SEND the last 3 elements. To me, this suggests that the first two arguments of Allgatherv should be:
Processor 0: MPI_Allgatherv(&mydata[0],2,…
Processor 1: MPI_Allgatherv(&mydata[2],3,…
(Q1) Am I right about that? If so, my next question is in regard to the format of argument 2. Let's say I create a std::vector < int > sendcount such that sendcount[0] = 2, and sendcount[1] = 3.
(Q2) Does Argument 2 require the reference to the first location of sendcount, or do I need to send the reference to the location relevant to each processor? In other words, which of these should I do:
Q2 - OPTION 1
Processor 0: MPI_Allgatherv(&mydata[0], &sendcount[0],…
Processor 1: MPI_Allgatherv(&mydata[2], &sendcount[0],…
Q2 - OPTION 2
Processor 0: MPI_Allgatherv(&mydata[0], &sendcount[id_proc], … (here id_proc = 0)
Processor 1: MPI_Allgatherv(&mydata[2], &sendcount[id_proc], … (here id_proc = 1)
...On to Argument 4. Since I am collecting different sections of mydata back into itself, I suspect that this argument will look similar to Argument 1. i.e. it should be something like &mydata[?]. (Q3) Can this argument simply be a reference to the beginning of mydata (i.e. &mydata[0]), or do I have to change the index the way I did for Argument 1? (Q4) Imagine I had used 3 processors. This would mean that Processor 1 would be sending mydata[2] and mydata[3] which are in "the middle" of the vector. Since the vector's elements are contiguous, then the data that Processor 1 is receiving has to be split (some goes before, and mydata[4] goes after). Do I have to account for that split in this argument, and if so, how?
...Slightly more confusing to me is Argument 5 but I had an idea this morning. Using the toy example: if Processor 0 is sending 2 elements, then it will be receiving 3, correct? Similarly, if Processor 1 is sending 3 elements, then it is receiving 2. (Q5) So, if I were to create a std::vector < int > recvcount, couldn't I just initialize it as:
for (int i = 0; i < tot_proc; i++)
{
recvcount[i] = mydata.size() - sendcount[i];
}
And if that is true, then do I pass it to Allgatherv as &recvcount[0] or &recvcount[id_proc] (similar to Argument 2)?
Finally, Argument 6. I know this is tied to my input for Argument 4. My guess is the following: if I were to pass &mydata[0] as Argument 4 on all processors, then the displacements are the number of positions in memory that I need to move in order to get to the first location where data actually needs to be received. For example,
Processor 0: MPI_Allgatherv( … , &mydata[0], … , 2, … );
Processor 1: MPI_Allgatherv( … , &mydata[0], … , 0, … );
(Q5) Am I right in thinking that the above two lines means "Processor 0 will receive data beginning at location &mydata[0+2]. Processor 1 will receive data beginning at location &mydata[0+0]." ?? And what happens when the data needs to be split like in Q4? Finally, since I am collecting portions of a vector back into itself (replacing mydata with updated mydata by overwriting it), then this tells me that all processors other than the root process will be receiving data beginning at &mydata[0]. (Q6) If this is true, then shouldn't the displacements be 0 for all processors that are not the root?
Some of the links I've read: Difference between MPI_allgather and MPI_allgatherv Difference between MPI_Allgather and MPI_Alltoall functions? Problem with MPI_Gatherv for std::vector C++: Using MPI's gatherv to concatenate vectors of differing lengths http://www.mcs.anl.gov/research/projects/mpi/www/www3/MPI_Allgatherv.html https://computing.llnl.gov/tutorials/mpi/#Routine_Arguments
My previous post on stackoverflow: MPI C++ matrix addition, function arguments, and function returns
Most tutorials, etc, that I've read just gloss over Allgatherv.
Upvotes: 4
Views: 5172
Reputation: 50927
Part of confusion here is that you're trying to do an in-place gather; you are trying to send from and receive into the same array. If you're doing that, you should use the MPI_IN_PLACE option, in which case you don't explicitly specify the send location or count. Those are there for if you are sending from a different buffer than you're receiving into, but in-place gathers are somewhat more constrained.
So this works:
#include <iostream>
#include <vector>
#include <mpi.h>
int main(int argc, char **argv) {
int size, rank;
MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &size);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
if (size < 2) {
std::cerr << "This demo requires at least 2 procs." << std::endl;
MPI_Finalize();
return 1;
}
int datasize = 2*size + 1;
std::vector<int> data(datasize);
/* break up the elements */
int *counts = new int[size];
int *disps = new int[size];
int pertask = datasize/size;
for (int i=0; i<size-1; i++)
counts[i] = pertask;
counts[size-1] = datasize - pertask*(size-1);
disps[0] = 0;
for (int i=1; i<size; i++)
disps[i] = disps[i-1] + counts[i-1];
int mystart = disps[rank];
int mycount = counts[rank];
int myend = mystart + mycount - 1;
/* everyone initialize our data */
for (int i=mystart; i<=myend; i++)
data[i] = 0;
int nsteps = size;
for (int step = 0; step < nsteps; step++ ) {
for (int i=mystart; i<=myend; i++)
data[i] += rank;
MPI_Allgatherv(MPI_IN_PLACE, 0, MPI_DATATYPE_NULL,
&(data[0]), counts, disps, MPI_INT, MPI_COMM_WORLD);
if (rank == step) {
std::cout << "Rank " << rank << " has array: [";
for (int i=0; i<datasize-1; i++)
std::cout << data[i] << ", ";
std::cout << data[datasize-1] << "]" << std::endl;
}
}
delete [] disps;
delete [] counts;
MPI_Finalize();
return 0;
}
Running gives
$ mpirun -np 3 ./allgatherv
Rank 0 has array: [0, 0, 1, 1, 2, 2, 2]
Rank 1 has array: [0, 0, 2, 2, 4, 4, 4]
Rank 2 has array: [0, 0, 3, 3, 6, 6, 6]
Upvotes: 4