Reputation: 8981
I am running code in parallel using mpi4py. I've noticed that if I run the code and perform a keyboard interrupt, my context manager's __exit__ will run if I run the code as python file.py but will not when run as mpirun -np 1 python file.py (this is only one process, but it behaves the same as if I run the code with more processes).
How do I get the terminated MPI run to cause the code to exit like a normal Python process and enter the context manager's __exit__ method?
Minimal reproducible example:
from mpi4py import MPI


def f(i):
    # raise KeyboardInterrupt()
    return i**0.5


class ContextManager():
    def __init__(self,):
        return

    def __enter__(self,):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        comm = MPI.COMM_WORLD
        rank = comm.Get_rank()
        size = comm.Get_size()
        print("Exiting.")
        if rank == 0:
            print(rank, size)
            print(exc_type)
            print(exc_value)
            print(traceback)


if __name__ == "__main__":
    with ContextManager():
        for i in range(1_000_000):
            print(f(i))
If I uncomment the interrupt, whether the code is run via python file.py or mpirun -np 1 python file.py, the context manager's __exit__ is run and the printouts are displayed.
If I run the code as shown above (the interrupt is commented out) and hit Ctrl+C in the middle of the run, then:
With python file.py, __exit__ is entered and the printouts are displayed.
With mpirun -np 1 python file.py, the code simply terminates and never enters __exit__ (i.e., there are no printouts).
Versions:
python: 3.10.14
mpirun: 4.1.4
mpi4py: 3.1.6
Ubuntu: 22.04.4 LTS
Edit: Following the answers in this question, the interrupt is still not captured.
from mpi4py import MPI
import signal


def sigterm_handler(signum, frame):
    print("Here")
    raise KeyboardInterrupt


signal.signal(signal.SIGTERM, sigterm_handler)


def f(i):
    # raise KeyboardInterrupt()
    return i**0.5


class ContextManager():
    def __init__(self,):
        return

    def __enter__(self,):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        comm = MPI.COMM_WORLD
        rank = comm.Get_rank()
        size = comm.Get_size()
        print("Exiting.")
        if rank == 0:
            print(rank, size)
            print(exc_type)
            print(exc_value)
            print(traceback)


if __name__ == "__main__":
    with ContextManager():
        for i in range(1_000_000):
            print(f(i))
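To check the handler on its own, independent of mpirun, here is a minimal sketch that needs no MPI at all. The process sends SIGTERM to itself with os.kill, the handler raises KeyboardInterrupt, and __exit__ records the exception type it receives. The names RecordingManager and run_once are hypothetical, and a POSIX system is assumed. If this behaves as expected when run directly but the full script still prints nothing under mpirun, the difference is presumably in how the launcher delivers signals or in lost, unflushed output; that is an assumption to test, not a diagnosis.

```python
import os
import signal


class RecordingManager:
    """Hypothetical context manager that records what __exit__ receives."""

    def __init__(self):
        self.seen = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.seen = exc_type
        # flush=True so the message is not left in a buffer if the process dies
        print("Exiting with", exc_type, flush=True)
        return True  # suppress the exception so the demo can return normally


def sigterm_handler(signum, frame):
    # Turn SIGTERM into an ordinary Python exception so that `with` unwinds
    raise KeyboardInterrupt


def run_once():
    # Must be called from the main thread: signal.signal requires it
    previous = signal.signal(signal.SIGTERM, sigterm_handler)
    manager = RecordingManager()
    try:
        with manager:
            os.kill(os.getpid(), signal.SIGTERM)  # simulate an external SIGTERM
            # Safety margin in case the Python-level handler has not run yet
            for _ in range(1000):
                pass
    finally:
        signal.signal(signal.SIGTERM, previous)  # restore the old handler
    return manager.seen


if __name__ == "__main__":
    run_once()
```

The flush=True in __exit__ is a deliberate choice for this sketch: if a process is killed shortly after being signalled, buffered output can disappear, which would look the same as __exit__ never running.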
Upvotes: 0
Views: 70
Reputation: 185
I think you probably need to put the whole for loop inside a try/except block.
if __name__ == "__main__":
    with ContextManager():
        try:
            for i in range(1_000_000):
                print(f(i))
        except KeyboardInterrupt:
            pass
I think it is because, in some environments, a KeyboardInterrupt (SIGINT) just terminates the program without unwinding the with block and calling its __exit__() method.
With the try/except block, the KeyboardInterrupt is caught, so the with block exits normally and calls __exit__(). It probably won't print a traceback, however, since the exception has already been handled by the time __exit__() runs.
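The point about the missing traceback can be illustrated with a small sketch that needs no MPI. It raises KeyboardInterrupt directly to stand in for Ctrl+C and records what __exit__ receives when the try/except sits inside the with block versus around it. ExitRecorder, caught_inside, and caught_outside are hypothetical names used only for this illustration.

```python
class ExitRecorder:
    """Hypothetical context manager that records the exc_type seen by __exit__."""

    def __init__(self):
        self.seen = "not called"

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.seen = exc_type
        return False  # do not suppress; let any exception propagate


def caught_inside():
    # try/except inside the with block: the exception never reaches __exit__
    recorder = ExitRecorder()
    with recorder:
        try:
            raise KeyboardInterrupt  # stands in for Ctrl+C
        except KeyboardInterrupt:
            pass
    return recorder.seen


def caught_outside():
    # try/except around the with block: __exit__ sees the exception first
    recorder = ExitRecorder()
    try:
        with recorder:
            raise KeyboardInterrupt  # stands in for Ctrl+C
    except KeyboardInterrupt:
        pass
    return recorder.seen


if __name__ == "__main__":
    print(caught_inside())
    print(caught_outside())
```

If __exit__ needs the exception details, wrapping the with statement in the try/except rather than the other way round keeps them available.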
Upvotes: 0