Reputation: 115
I'm new to MPI and want to measure the communication cost of MPI_Send and MPI_Recv between two nodes. I have written the following code for this purpose:
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <mpi.h>

/*==============================================================
 * print_elapsed (prints timing statistics)
 *==============================================================*/
void print_elapsed(const char* desc, struct timeval* start, struct timeval* end, int numiterations) {
  struct timeval elapsed;

  /* calculate elapsed time, borrowing a second if the usec field underflows */
  if (start->tv_usec > end->tv_usec) {
    end->tv_usec += 1000000;
    end->tv_sec--;
  }
  elapsed.tv_usec = end->tv_usec - start->tv_usec;
  elapsed.tv_sec  = end->tv_sec  - start->tv_sec;

  printf("\n%s average elapsed time per iteration = %ld (usec)\n",
         desc, (elapsed.tv_sec * 1000000 + elapsed.tv_usec) / numiterations);
}

int main(int argc, char **argv) {
  int nprocs;                 /* number of processes */
  int nElements;              /* message size (command line arg) */
  int my_id;
  long double *buffer, *rec_buffer;

  struct timeval start, end;  /* gettimeofday stuff */
  struct timezone tzp;

  MPI_Status status;          /* Status variable for MPI operations */

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &my_id);   /* Get the rank of this process */

  /*---------------------------------------------------------
   * Read command line
   *  - check usage and parse args
   *---------------------------------------------------------*/
  if (argc < 2) {
    if (my_id == 0)
      printf("Usage: %s [nElements]\n\n", argv[0]);
    MPI_Finalize();
    exit(1);
  }

  nElements = atoi(argv[1]);
  int numiterations = 64;

  MPI_Comm_size(MPI_COMM_WORLD, &nprocs);  /* Get number of processes */
  if (nprocs < 2) {
    if (my_id == 0)
      printf("This benchmark needs at least two processes.\n");
    MPI_Finalize();
    exit(1);
  }

  if (my_id == 0)
    printf("\nExecuting %s: numElements=%d \n", argv[0], nElements);

  buffer     = (long double *) malloc(sizeof(long double) * nElements);
  rec_buffer = (long double *) malloc(sizeof(long double) * nElements);

  if (buffer == NULL || rec_buffer == NULL) {
    printf("Processor %d - unable to malloc()\n", my_id);
    MPI_Finalize();
    exit(1);
  }

  MPI_Barrier(MPI_COMM_WORLD);

  if (my_id == 1)
    gettimeofday(&start, &tzp);

  /* rank 0 sends to rank 1; the datatype matches the long double buffers */
  for (int i = 0; i < numiterations; ++i) {
    if (my_id == 0)
      MPI_Send(buffer, nElements, MPI_LONG_DOUBLE, 1, 0, MPI_COMM_WORLD);
    if (my_id == 1)
      MPI_Recv(rec_buffer, nElements, MPI_LONG_DOUBLE, 0, 0, MPI_COMM_WORLD, &status);
  }

  if (my_id == 1)
    gettimeofday(&end, &tzp);

  MPI_Barrier(MPI_COMM_WORLD);

  if (my_id == 1)
    print_elapsed("Send/Recv", &start, &end, numiterations);

  free(buffer);
  free(rec_buffer);

  MPI_Finalize();
  return 0;
} /* main() */
I repeat the send and receive numiterations times, but this gives me no information about the initialization cost versus the actual communication time. Are there better methods or tools to measure the communication cost in more detail?
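For example, would something like the rough sketch below be a reasonable way to separate the first iteration (which I assume includes connection setup) from the steady-state iterations? It times each ping individually with MPI_Wtime; the buffer size and iteration count are just placeholders.

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

/* Rough sketch: time every iteration individually with MPI_Wtime so the
 * first one (which may include connection setup) can be reported
 * separately from the steady-state average. */
int main(int argc, char **argv) {
    const int nElements = (argc > 1) ? atoi(argv[1]) : 1024;
    const int numiterations = 64;
    int my_id;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_id);

    long double *buf = malloc(sizeof(long double) * nElements);
    double *t = malloc(sizeof(double) * numiterations);   /* per-iteration times */

    MPI_Barrier(MPI_COMM_WORLD);

    for (int i = 0; i < numiterations; ++i) {
        double t0 = MPI_Wtime();
        if (my_id == 0)
            MPI_Send(buf, nElements, MPI_LONG_DOUBLE, 1, 0, MPI_COMM_WORLD);
        else if (my_id == 1)
            MPI_Recv(buf, nElements, MPI_LONG_DOUBLE, 0, 0, MPI_COMM_WORLD, &status);
        t[i] = MPI_Wtime() - t0;
    }

    if (my_id == 1) {
        double steady = 0.0;
        for (int i = 1; i < numiterations; ++i)   /* skip the first iteration */
            steady += t[i];
        printf("first iteration  : %g usec\n", t[0] * 1e6);
        printf("steady-state mean: %g usec\n", steady / (numiterations - 1) * 1e6);
    }

    free(buf);
    free(t);
    MPI_Finalize();
    return 0;
}

Or is looking at the first iteration like this too naive to capture the setup cost?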
Upvotes: 2
Views: 1202
Reputation: 9072
If you want the level of detail you describe, you'll probably have to get down into the implementation of the MPI library itself. What you're measuring now is the impact of communication on your application, but depending on your infrastructure there may be more to the communication than that: some networks can make progress without involving the application at all, and some MPI libraries use a separate thread to make asynchronous progress on messages.
How you measure these things will depend on your system and on the constraints above. If all you care about is how much time your application spends blocked in communication calls, then you have more or less already accomplished that. You could also use tracing tools to get similar information (HPCtoolkit is one that comes to mind; I've used it in the past).
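To give an idea of how most of those tools hook in: the MPI standard's PMPI profiling interface lets you interpose on the communication calls yourself, without touching the library. The snippet below is only a minimal sketch (it wraps MPI_Send, accumulates the time spent blocked in it, and prints the total at MPI_Finalize; it assumes an MPI-3 mpi.h where the send buffer is const-qualified), whereas real tracing tools wrap many more entry points:

#include <stdio.h>
#include <mpi.h>

/* Minimal PMPI interposition sketch: link this file into the application
 * (or preload it as a shared library). Calls to MPI_Send land here, and
 * the actual work is forwarded to the library's PMPI_Send entry point. */

static double total_send_time = 0.0;   /* time spent blocked in MPI_Send */
static long   send_calls      = 0;

int MPI_Send(const void *buf, int count, MPI_Datatype datatype,
             int dest, int tag, MPI_Comm comm) {
    double t0 = MPI_Wtime();
    int err = PMPI_Send(buf, count, datatype, dest, tag, comm);
    total_send_time += MPI_Wtime() - t0;
    send_calls++;
    return err;
}

/* Report the accumulated totals when the application shuts down. */
int MPI_Finalize(void) {
    int rank;
    PMPI_Comm_rank(MPI_COMM_WORLD, &rank);
    printf("rank %d: %ld MPI_Send calls, %g s blocked in MPI_Send\n",
           rank, send_calls, total_send_time);
    return PMPI_Finalize();
}

This works because the standard requires every MPI_* routine to have an equivalent PMPI_* entry point, so the wrappers add timing without any changes to the library itself.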
If you want to get more detailed information about what's going on under the hood, you'll have to poke around inside your implementation and start instrumenting internally (assuming you're using an open source implementation such as MPICH or Open MPI). This is a much more involved process and the mechanisms will change from one implementation to another.
Upvotes: 4