Reputation: 23
I have encountered a problem with MPI_Comm_split that seems to occur only with Open MPI 1.4.3. Example code:
#include <mpi.h>
#include <cassert>
#include <vector>

const size_t n_test=1000000;

class MyComm{
private:
    MPI_Comm comm;
public:
    int size,rank;
    MyComm(){
        comm=MPI_COMM_WORLD;
        MPI_Comm_rank(comm,&rank);
        MPI_Comm_size(comm,&size);
    }
    MyComm(const MyComm&);
    MyComm(const MyComm& c, int col){
        MPI_Comm_split(c.comm,col,c.rank,&comm);
        MPI_Comm_size(comm,&size);
        MPI_Comm_rank(comm,&rank);
    }
    ~MyComm(){
        if(comm!=MPI_COMM_WORLD) MPI_Comm_free(&comm);
    }
};

void split(){
    std::vector<MyComm*> communicators;
    communicators.push_back(new MyComm());
    while(communicators.back()->size >1){
        int size=communicators.back()->size;
        int rank=communicators.back()->rank;
        int color= (rank >= size/2) ? 1 : 0;
        communicators.push_back(new MyComm(*communicators.back(),color));
        if(color==0) assert( communicators.back()->size==(size-size%2)/2 );
        else assert( communicators.back()->size==(size+size%2)/2 );
    }
    for(size_t i=0;i<communicators.size();++i) delete communicators[i];
}

int main(int argc, char** argv){
    MPI_Init(&argc,&argv);
    for(size_t count=0;count<n_test;++count) split();
    MPI_Finalize();
    return 0;
}
The problem is that the sizes of the new communicators are not always correct, so one of the assertions fails. This happens only for certain numbers of processes, e.g. 7, and not in every execution. I have compiled the code with g++ and icpc (on Ubuntu 12.04, Open MPI 1.4.3), and the error occurs in both executables. The error does not occur with Open MPI 1.6.5 or 1.8.3. This might look like a bug in Open MPI 1.4.3, but since the behaviour of MPI is unspecified when it is used incorrectly, it might also be a problem with my code. So, my questions are:
1) Can anyone find a mistake in my code?
2) Does anyone know about problems with MPI_Comm_split in Open MPI 1.4.3 that have been solved in later versions?
(By the way, all MPI routines return MPI_SUCCESS.)
Upvotes: 2
Views: 529
Reputation: 764
At first glance, your code looks fine.
I'd stick with the later versions of Open MPI because countless bugs have been fixed since the 1.4.x series. In particular, 1.4.x is so old that it is probably not worth spelunking through the records to see whether an issue with MPI_COMM_SPLIT has been fixed since then.
Upvotes: 2