Reputation: 706
I have to validate the state of some data in an MPI program. The program will run on a supercomputer with a distributed-memory system. A quick look at the C standard's assert macro revealed that assert internally calls abort() to terminate the program. I haven't found much information about how abort() behaves in a multi-process program, especially in the MPI context, which is very different from a plain POSIX environment. Does abort() only terminate the process in which it is called, or can it terminate all the processes?
And finally, how do I actually terminate all processes of an MPI program when a condition fails? Is there a built-in assert in the MPI library?
Upvotes: 2
Views: 512
Reputation: 8395
abort() only terminates the MPI task that invokes it. It is very likely that this will be detected by mpirun and/or the resource manager, which will then kill the whole MPI job (i.e. all the MPI tasks on all nodes). That being said, this is library/system dependent, and you should double-check it first.
The right way to terminate an MPI job is to call MPI_Abort(MPI_COMM_WORLD, errorcode), where errorcode is an int that is generally assigned a strictly positive value.
Upvotes: 3