jared

Reputation: 8981

Capture KeyboardInterrupt in context manager when OpenMPI run is manually terminated

I am running code in parallel using mpi4py. If I interrupt the run from the keyboard, my context manager's __exit__ runs when the code is started as python file.py, but not when it is started as mpirun -np 1 file.py (a single process here, but the behavior is the same with more processes).

How can I make a manually terminated MPI run exit like a normal Python process, so that the context manager's __exit__ method is called?

Minimal reproducible example:

from mpi4py import MPI

def f(i):
    # raise KeyboardInterrupt()
    return i**0.5
    
class ContextManager():
    def __init__(self,):
        return
    
    def __enter__(self,):
        return self
        
    def __exit__(self, exc_type, exc_value, traceback):
        comm = MPI.COMM_WORLD
        rank = comm.Get_rank()
        size = comm.Get_size()
        
        print("Exiting.")
        
        if rank == 0:
            print(rank, size)
            print(exc_type)
            print(exc_value)
            print(traceback)
        

if __name__ == "__main__":
    with ContextManager():
        for i in range(1_000_000):
            print(f(i))

If I uncomment the interrupt, whether the code is run via python file.py or mpirun -np 1 file.py, the context manager __exit__ is run and the printouts are displayed.

If I run the code as shown above (with the interrupt commented out) and hit Ctrl+C in the middle of the run, then __exit__ runs under python file.py but not under mpirun -np 1 file.py.

Versions:

python: 3.10.14
mpirun: 4.1.4
mpi4py: 3.1.6
Ubuntu: 22.04.4 LTS

Edit: I followed the answers in this question and installed a SIGTERM handler that raises KeyboardInterrupt, but the interrupt is still not captured.

from mpi4py import MPI
import signal

def sigterm_handler(signum, frame):
    print("Here")
    raise KeyboardInterrupt

signal.signal(signal.SIGTERM, sigterm_handler)


def f(i):
    # raise KeyboardInterrupt()
    return i**0.5
    
class ContextManager():
    def __init__(self,):
        return
    
    def __enter__(self,):        
        return self
        
    def __exit__(self, exc_type, exc_value, traceback):   
        comm = MPI.COMM_WORLD
        rank = comm.Get_rank()
        size = comm.Get_size()
        
        print("Exiting.")
        
        if rank == 0:
            print(rank, size)
            print(exc_type)
            print(exc_value)
            print(traceback)
        

if __name__ == "__main__":
    with ContextManager():
        for i in range(1_000_000):
            print(f(i))

Upvotes: 0

Views: 70

Answers (1)

DeepThought42

Reputation: 185

I think you probably need to put the whole for loop inside a try/except block.

if __name__ == "__main__":
    with ContextManager():
        try:
            for i in range(1_000_000):
                print(f(i))
        except KeyboardInterrupt:
            pass

I think it is because, with some implementations of Python, a KeyboardInterrupt (SIGINT) just terminates the program without unwinding with blocks or calling __exit__(). With the try/except block, __exit__() is called because the KeyboardInterrupt is caught and the with block then exits normally. Note that because the exception is caught inside the with block, __exit__() receives None for exc_type, exc_value and traceback, so it won't print a traceback.
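As a rough check of the Python side only, here is a minimal sketch with no MPI involved. It sends SIGTERM to its own process, using a handler that raises KeyboardInterrupt like the one in the question's edit, and records what __exit__ receives. RecordingContextManager and run_demo are hypothetical names made up for this illustration. It assumes a POSIX system and that it runs in the main thread. It does not show which signal, if any, mpirun delivers to the ranks.

```python
import os
import signal


def sigterm_handler(signum, frame):
    # Turn SIGTERM into an ordinary Python exception so `with` blocks unwind.
    raise KeyboardInterrupt


class RecordingContextManager:
    """Hypothetical stand-in for the question's ContextManager (no MPI calls)."""

    def __init__(self):
        self.seen_exc_type = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Remember which exception type (if any) reached __exit__.
        self.seen_exc_type = exc_type
        # Returning True suppresses the exception so the demo can continue.
        return True


def run_demo():
    # Install the handler, remembering the previous one so it can be restored.
    previous = signal.signal(signal.SIGTERM, sigterm_handler)
    manager = RecordingContextManager()
    try:
        with manager:
            # Deliver SIGTERM to this very process.
            os.kill(os.getpid(), signal.SIGTERM)
            # Give the interpreter a chance to run the handler
            # in case it has not fired yet.
            for _ in range(1000):
                pass
    finally:
        signal.signal(signal.SIGTERM, previous)
    return manager.seen_exc_type


if __name__ == "__main__":
    print(run_demo())
```

If this reports KeyboardInterrupt, then a handler that raises does reach __exit__ in a plain Python process. That would suggest the difference under mpirun lies in how the signal reaches (or fails to reach) the Python process, rather than in the context manager itself.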

Upvotes: 0
