Sam

Reputation: 133

I cannot understand the "recvcounts" parameter of MPI_Gatherv

MPI_Gatherv is an MPI function with the following signature:

int MPI_Gatherv(
    void* sendbuf,
    int sendcount,
    MPI_Datatype sendtype,
    void* recvbuf,
    int *recvcounts,
    int *displs,
    MPI_Datatype recvtype,
    int root,
    MPI_Comm comm)

The type of "recvcounts" is "int *", so it should let us set the number of items to receive from each process individually. However, I found that I cannot make this work:

when recvcounts[i] > sendcount, the root process receives only sendcount items;

when recvcounts[i] < sendcount, the program crashes with an error message like this:

Fatal error in PMPI_Gatherv: Message truncated, error stack:
PMPI_Gatherv(386).....: MPI_Gatherv failed(sbuf=0012FD34, scount=2, MPI_CHAR, rbuf=0012FCC8, rcnts=0012FB30, displs=0012F998, MPI_CHAR, root=0, MPI_COMM_WORLD) failed
MPIR_Gatherv_impl(199):
MPIR_Gatherv(103).....:
MPIR_Localcopy(332)...: Message truncated; 2 bytes received but buffer size is 1

So does this mean the root has to receive a fixed number of items from each process, and the recvcounts parameter is meaningless? Or have I misunderstood something?

Here is my code:

#include <mpi.h>
#include <cstdio>   // printf
#include <cstring>  // memset

int main(int argc, char* argv[])
{
    MPI_Init(&argc, &argv);

    int n, id;
    MPI_Comm_size(MPI_COMM_WORLD, &n);
    MPI_Comm_rank(MPI_COMM_WORLD, &id);

    char x[100], y[100];
    memset(x, '0' + id, sizeof(x));
    memset(y, '%', sizeof(y));
    int cnts[100], offs[100] = {0};
    for (int i = 0; i < n; i++)
    {
        cnts[i] = i + 1;
        if (i > 0)
        {
            offs[i] = offs[i - 1] + cnts[i - 1];
        }
    }
    MPI_Gatherv(x, 1, MPI_CHAR, y, cnts, offs, MPI_CHAR, 0, MPI_COMM_WORLD);    // receive only 1 item from each process
    //MPI_Gatherv(x, 2, MPI_CHAR, y, cnts, offs, MPI_CHAR, 0, MPI_COMM_WORLD);    // crash
    if (id == 0)
    {
        printf("Gatherv:\n");
        for (int i = 0; i < 100; i++)
        {
            printf("%c ", y[i]);
        }
        printf("\n");
    }

    MPI_Finalize();

    return 0;
}

Upvotes: 1

Views: 1246

Answers (1)

Jonathan Dursi

Reputation: 50927

As @Alexander Molodih points out, sendcount=recvcount, sendtype=recvtype will always work; but when you start creating your own MPI types, you often have different send and receive types, and that's why recvcount might differ from sendcount.

As an example, take a look at the recently asked question "MPI partition matrix into blocks"; there a 2-dimensional array is decomposed into blocks and scatterv'ed. The send type (which has to pick out only the necessary data from the global array) and the receive type (which is just a contiguous block of data) are different, and so are the counts.

That's the general reason why send and receive types and counts can differ in things like sendrecv, gather/scatter, or any other operation where both sending and receiving occur.
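To make the point about differing counts concrete, here is a minimal sketch in plain C++ with no MPI calls. The `StridedType` struct and the `pack_one` helper are hypothetical stand-ins for what a strided derived type describes (for example one built with MPI_Type_vector); they are not part of MPI. The sender describes one column of a 3x4 row-major array as a single strided element (count 1), while a receiver using plain chars sees the same data as 3 contiguous items (count 3).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a strided derived type (not an MPI API):
// `count` blocks of `blocklen` chars, with block starts `stride` chars apart.
struct StridedType {
    std::size_t count;
    std::size_t blocklen;
    std::size_t stride;
};

// Gather the chars selected by ONE element of type `t`, beginning at index
// `start` of `data`, into a contiguous string. This is the data a receiver
// would see if it used a plain contiguous char type.
std::string pack_one(const std::string& data, std::size_t start, const StridedType& t) {
    // The last selected block must lie inside `data`.
    assert(t.count == 0 || start + (t.count - 1) * t.stride + t.blocklen <= data.size());
    std::string out;
    for (std::size_t b = 0; b < t.count; ++b) {
        out.append(data, start + b * t.stride, t.blocklen);
    }
    return out;
}
```

For the array "abcd" / "efgh" / "ijkl" stored as "abcdefghijkl", `pack_one(data, 1, StridedType{3, 1, 4})` selects the second column. The send side has a count of 1 (one column element) and the receive side has a count of 3 chars, yet both describe the same three items.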

In your gatherv case, each process might have its own sendcount, but the recvcounts[] array has to be a list of all those counts so that the receiver can properly place the received data. If you didn't know those values beforehand (each rank only knew its own count, cnts[id]), you could do a gather first:

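int mycnt = cnts[id];   // copy first: send and receive buffers must not overlap on the root
MPI_Gather(&mycnt, 1, MPI_INT, cnts, 1, MPI_INT, 0, MPI_COMM_WORLD);
if (id == 0) {          // cnts[] and offs[] are only significant at the root
    for (int i = 1; i < n; i++) {
        offs[i] = offs[i - 1] + cnts[i - 1];
    }
}
MPI_Gatherv(x, mycnt, MPI_CHAR, y, cnts, offs, MPI_CHAR, 0, MPI_COMM_WORLD);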
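
The bookkeeping in the snippet above can be checked without running MPI. The following is a minimal sketch in plain C++; `make_displs` and `simulate_gatherv` are hypothetical helpers introduced for illustration, not MPI functions. It computes the displacements as a running sum of the counts and copies each rank's chunk to its offset, mirroring what the root should end up with when every rank's sendcount equals its entry in recvcounts.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Displacements as an exclusive prefix sum of the per-rank counts,
// the same recurrence as offs[i] = offs[i - 1] + cnts[i - 1].
std::vector<int> make_displs(const std::vector<int>& counts) {
    std::vector<int> displs(counts.size(), 0);
    for (std::size_t i = 1; i < counts.size(); ++i) {
        displs[i] = displs[i - 1] + counts[i - 1];
    }
    return displs;
}

// Hypothetical stand-in for the root's view of a gatherv (no MPI involved):
// rank i contributes counts[i] copies of the character '0' + i, as in the
// question's memset, and the receive buffer starts out filled with '%'.
std::string simulate_gatherv(const std::vector<int>& counts, std::size_t recv_size) {
    const std::vector<int> displs = make_displs(counts);
    std::string recvbuf(recv_size, '%');
    for (std::size_t i = 0; i < counts.size(); ++i) {
        // Each chunk must fit inside the receive buffer.
        assert(counts[i] >= 0);
        assert(static_cast<std::size_t>(displs[i] + counts[i]) <= recv_size);
        for (int k = 0; k < counts[i]; ++k) {
            recvbuf[displs[i] + k] = static_cast<char>('0' + i);
        }
    }
    return recvbuf;
}
```

With counts {1, 2, 3} and a buffer of 8 characters, `simulate_gatherv` returns "011222%%": one item from rank 0, two from rank 1, three from rank 2, and the untouched remainder of the buffer.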

Upvotes: 2
