SurenNihalani

Reputation: 1438

MPI_Alltoallv leads to MPI_ERR_TRUNCATE

I have the following MPI_Alltoallv call. All of the buffer, count, and displacement arguments are vectors:

MPI_Alltoallv(
        &elements[0],
        &send_counts[0],
        &send_displacements[0],
        MPI_INT,
        &receiving_vector[0],
        &receiving_counts[0],
        &receiving_displacements[0],
        MPI_INT,
        MPI_COMM_WORLD
         );

Here are the contents of the vectors:

Elements : [6, 5, 4, ]
 @ 0
Elements : [3, 2, 1, ]
 @ 1
send_counts : [3, 0, ]
 @ 1
send_displacements : [0, 3, ]
 @ 1
 receiving_vector  : [0, 0, 0, ]
 @ 0
elements : [6, 5, 4, ]
 @ 0
send_counts : [0, 3, ]
 @ 0
send_displacements : [0, 0, ]
 @ 0
 receiving_vector  : [0, 0, 0, ]
 @ 1
receiving_counts : [0, 3, ]
 @ 1
receiving_displacements : [0, 0, ]
 @ 1
[lawn-143-215-98-238:1182] *** An error occurred in MPI_Alltoallv
[lawn-143-215-98-238:1182] *** reported by process [2332229633,0]
[lawn-143-215-98-238:1182] *** on communicator MPI_COMM_WORLD
[lawn-143-215-98-238:1182] *** MPI_ERR_TRUNCATE: message truncated
[lawn-143-215-98-238:1182] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[lawn-143-215-98-238:1182] ***    and potentially your MPI job)
receiving_counts : [3, 0, ]
 @ 0
receiving_displacements : [0, 0, ]
 @ 0

I don't understand why I am getting this error. Any help would be deeply appreciated.

I have googled this error and it's probably my receiving vector's size but I have tried many sizes and haven't gotten anywhere.

Upvotes: 2

Views: 858

Answers (1)

Hristo Iliev

Reputation: 74475

There is a mismatch between the amount of data sent and the amount of data received. Since you only have two ranks, it is easy to draw a table of who sends how much and to whom. Each row of the table is the content of send_counts[] at the corresponding rank:

      receiver
s    | 0 | 1 |
e ---+---+---+
n  0 | 0 | 3 |  (send_counts[] @ 0)
d ---+---+---+
e  1 | 3 | 0 |  (send_counts[] @ 1)
r ---+---+---+

To match the amount of data sent, the receive counts at each rank should be equal to the column-vector from the table above that corresponds to that rank:

  • receiving_counts[] @ 0 should be { 0, 3 } while you have [3, 0, ];

  • receiving_counts[] @ 1 should be { 3, 0 } while you have [0, 3, ].

Hence the truncation error.
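
The column rule above can be checked with a small sketch. This is plain C++ with no MPI calls, and `expected_recv_counts` is a hypothetical helper name introduced only for illustration. It assumes you have every rank's send_counts[] in one place (for example, copied from the debug output) and returns the column of the table that a given rank should use as its receive counts:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of MPI.
// send_counts_by_rank[s] is send_counts[] at rank s (one row of the table).
// The result is column `rank` of that table: entry s is the number of
// elements that rank s sends to `rank`, which is what `rank` must put in
// its receive counts for MPI_Alltoallv.
std::vector<int> expected_recv_counts(
    const std::vector<std::vector<int>>& send_counts_by_rank,
    std::size_t rank) {
    const std::size_t n = send_counts_by_rank.size();
    assert(rank < n);

    std::vector<int> recv_counts(n);
    for (std::size_t sender = 0; sender < n; ++sender) {
        // Every rank must supply one count per rank in the communicator.
        assert(send_counts_by_rank[sender].size() == n);
        recv_counts[sender] = send_counts_by_rank[sender][rank];
    }
    return recv_counts;
}
```

With the rows from the table, `{{0, 3}, {3, 0}}`, the helper yields `{0, 3}` for rank 0 and `{3, 0}` for rank 1. In a real program a rank cannot see the other ranks' send counts directly; one common approach is to exchange them first with `MPI_Alltoall`, sending one `MPI_INT` to each rank, and use the received values as the receive counts.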

Upvotes: 2
