Reputation: 1052
I have a large Fortran90 application that I'm trying to debug. I'm trying to get all processors to dump their values of a particular variable at a given location in the program, but I only ever get subsets of the processors, sometimes with repeated processor ranks using the following code:
call mpi_barrier(mpi_comm_world, imstat)
do i = 0, nprocs
if (rank == i) print*, rank, ! ... hopefully useful stuff
call mpi_barrier(mpi_comm_world, imstat)
end do
I've had issues with stack corruption in this particular application in the past, so I suspect my problem is a bug like that. But I'm having trouble figuring out if the reason I'm not seeing the value on all processors is that bug or if my dumping code is somehow at fault.
Output the first time I call the routine (for nprocs = 30) is
rank, ntiltin = 29 1 m rank, ntiltin = 24 14
rank, ntiltin = 24 14
rank, ntiltin = 24 14
rank, ntiltin = 24 14
rank, ntiltin = 24 14
rank, ntiltin = 24 14
rank, ntiltin = 24 14
rank, ntiltin = 24 14
rank, ntiltin = 24 14
rank, ntiltin = 24 14
rank, ntiltin = 24 14
rank, ntiltin = 24 14
rank, ntiltin = 24 14
rank, ntiltin = 24 14
rank, ntiltin = 24 14
rank, ntiltin = 24 14
rank, ntiltin = 24 14
rank, ntiltin = 24 14
rank, ntiltin = 24 14
rank, ntiltin = 24 14
rank, ntiltin = 24 14
rank, ntiltin = 24 14
rank, ntiltin = 26 14
rank, ntiltin = 26 14
rank, ntiltin = 26 14
rank, ntiltin = 26 14
rank, ntiltin = 26 14
rank, ntiltin = 26 14
rank, ntiltin = 26 14
rank, ntiltin = 26 14
rank, ntiltin = 26 14
rank, ntiltin = 26 14
rank, ntiltin = 26 14
rank, ntiltin = 26 14
rank, ntiltin = 26 14
rank, ntiltin = 26 14
rank, ntiltin = 26 14
rank, ntiltin = 26 14
rank, ntiltin = 26 14
rank, ntiltin = 26 14
rank, ntiltin = 26 14
rank, ntiltin = 26 14
rank, ntiltin = 26 14
rank, ntiltin = 26 14
rank, ntiltin = 26 14
rank, ntiltin = 26 14
rank, ntiltin = 26 14
And another, the next time I call the same routine:
rank, ntiltin = 20 5
rank, ntiltin = 28 5
rank, ntiltin = 20 5
rank, ntiltin = 20 5
rank, ntiltin = 20 5
It also looks like I'm having output buffering issues, but I compiled using ifort, and a Google search turned up a result (Enable buffered I/O to stdout with Intel ifort compiler) that seemed to indicate that buffering to stdout is not something ifort will do. I am redirecting the output to a file, so perhaps there's some system buffering in there, and that's my problem?
So my question is: does that look like a reasonable code snippet for dumping the value of a variable on all processors, or is my lack of sleep catching up with me?
Thanks in advance!
Upvotes: 0
Views: 464
Reputation: 4926
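If you suspect a stack corruption problem, it's generally a good idea not to print to standard output. This matters even more with MPI, where stdout from every task is typically forwarded to a single process (often the launcher or task 0) that performs the actual writes, so barriers in your code do not guarantee the order in which lines appear in the output.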
As a general suggestion, in this case let each task open its own file, named after its task ID, write to that file, and flush after each write.
To line the files up afterwards, print something extra with each value, such as the iteration number, so that you can reconstruct the proper order across the separate output files.
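The per-task file approach described above could look roughly like the following. This is only a minimal sketch: the file naming scheme, the unit number 99, the loop over iter, and the placeholder value assigned to ntiltin are all assumptions made for illustration, not taken from the application in the question. It also uses the Fortran 2003 flush statement and the i0 edit descriptor, which ifort accepts but which are not strictly Fortran 90.

```fortran
program dump_per_rank
  use mpi
  implicit none
  integer, parameter :: dbg_unit = 99   ! assumed free unit number
  integer :: ierr, rank, nprocs, iter
  integer :: ntiltin                    ! stand-in for the variable being inspected
  character(len=32) :: fname

  call mpi_init(ierr)
  call mpi_comm_rank(mpi_comm_world, rank, ierr)
  call mpi_comm_size(mpi_comm_world, nprocs, ierr)

  ! One file per task, e.g. debug_rank_0007.txt (hypothetical naming scheme)
  write(fname, '(a,i4.4,a)') 'debug_rank_', rank, '.txt'
  open(unit=dbg_unit, file=trim(fname), status='replace', action='write')

  do iter = 1, 3
     ntiltin = 10*rank + iter           ! placeholder value for illustration only
     ! Tag each line with the iteration so the files can be aligned later
     write(dbg_unit, '(a,i0,a,i0,a,i0)') 'iter=', iter, ' rank=', rank, &
          ' ntiltin=', ntiltin
     flush(dbg_unit)                    ! push the line to disk before continuing
  end do

  close(dbg_unit)
  call mpi_finalize(ierr)
end program dump_per_rank
```

Because every task writes only to its own file, no barriers are needed for the dump itself, and a crash on one task should still leave the lines it flushed before failing.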
I know, it's annoying, but sometimes it's needed to spot... annoying problems! Once you have solved it, just get rid of all of it.
If you are dealing with a large project, you can consider wrapping the debug writes in #ifdef's and using the C preprocessor to enable or disable them at compile time.
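As a rough illustration of the preprocessor idea, the debug write can be fenced off so that it only exists in debug builds. The macro name DEBUG_DUMP and the subroutine dump_value are hypothetical names chosen for this sketch, and dbg_unit is assumed to be a unit that the calling task has already opened on its own file.

```fortran
! Writes one tagged debug line when compiled with DEBUG_DUMP defined;
! otherwise the body is empty and the call does nothing.
subroutine dump_value(dbg_unit, iter, rank, ntiltin)
  implicit none
  integer, intent(in) :: dbg_unit   ! unit already opened by the caller (assumption)
  integer, intent(in) :: iter, rank, ntiltin
#ifdef DEBUG_DUMP
  write(dbg_unit, '(a,i0,a,i0,a,i0)') 'iter=', iter, ' rank=', rank, &
       ' ntiltin=', ntiltin
  flush(dbg_unit)
#endif
end subroutine dump_value
```

The source has to go through the preprocessor for this to work, for example by giving the file a .F90 extension or by compiling with ifort's -fpp option, and the dump is switched on by adding -DDEBUG_DUMP to the compile line.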
Upvotes: 1