user1124109

Reputation: 37

Python multiprocessing: kill process if it is taking too long to return

Below is a simple example that freezes because a child process exits without returning anything and the parent keeps waiting forever. Is there a way to time out a process if it takes too long and let the rest continue? I am a beginner to multiprocessing in Python and I do not find the documentation very illuminating.

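import multiprocessing as mp
import sys
import time

def foo(x):
    if x == 3:
        sys.exit()  # the worker exits without returning a result
    result = x * x  # placeholder for some heavy computation
    return result

if __name__ == '__main__':
    pool = mp.Pool(mp.cpu_count())
    results = pool.map(foo, [1, 2, 3])  # the parent waits here forever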

Upvotes: 3

Views: 2832

Answers (1)

no_use123

Reputation: 294

I had the same problem, and this is how I solved it. There may be better solutions, but this one also handles issues not mentioned in the question. For example, if the process is using a lot of resources, a normal termination request can take a while to get through to it, so I use a forceful termination (kill -9). This part probably only works on Linux, so you may have to adapt the termination if you are using another OS.

It is part of my own code, so it is probably not copy-pasteable.

from multiprocessing import Process, Queue
import os 
import time 

timeout_s = 5000 # seconds after which you want to kill the process

queue = Queue()  # results can be written in here, if you have return objects

p = Process(target=INTENSIVE_FUNCTION, args=(ARGS_TO_INTENSIVE_FUNCTION, queue))
p.start()

start_time = time.time()
check_interval_s = 5  # regularly check what the process is doing

kill_process = False
finished_work = False

while not kill_process and not finished_work:
    time.sleep(check_interval_s)  
    now = time.time()
    runtime = now - start_time

    if not p.is_alive():
        print("finished work")
        finished_work = True

    if runtime > timeout_s and not finished_work:
        print("prepare killing process")
        kill_process = True

if kill_process:
    while p.is_alive():
        # Forcefully kill the process, because a graceful termination is often
        # ignored by a process that is busy with heavy computations.
        print(f"send SIGKILL signal to process because exceeding {timeout_s} seconds.")
        os.system(f"kill -9 {p.pid}")

        if p.is_alive():
            time.sleep(check_interval_s)
else:
    try:
        p.join(60)  # wait 60 seconds to join the process
        RETURN_VALS = queue.get(timeout=60)
    except Exception:
        # This can happen if a process was killed for other reasons (such as out of memory)
        print("Joining the process and receiving results failed, results are set as invalid.")
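
For something closer to copy-pasteable, the same idea can be condensed into one helper. This is only a minimal sketch, not the code I actually run: `run_with_timeout` and `slow_square` are made-up names for illustration, it assumes Python 3.7+ (for `Process.kill()`, which sends SIGKILL on Unix), and it assumes the worker takes the queue as its last argument and never legitimately returns `None`.

```python
import multiprocessing as mp
import queue as queue_module
import time


def slow_square(x, out_queue):
    """Hypothetical worker: sleeps for x seconds, then reports x squared."""
    time.sleep(x)
    out_queue.put(x * x)


def run_with_timeout(func, arg, timeout_s, poll_s=0.1):
    """Run func(arg, queue) in a child process.

    Returns the value the child put on the queue, or None if the child
    timed out or exited without producing a result.
    """
    out_queue = mp.Queue()
    proc = mp.Process(target=func, args=(arg, out_queue))
    proc.start()

    deadline = time.monotonic() + timeout_s
    result = None
    got_result = False

    while time.monotonic() < deadline:
        try:
            # Read the queue before joining, so a child that put a large
            # object on the queue is not left waiting for a reader.
            result = out_queue.get(timeout=poll_s)
            got_result = True
            break
        except queue_module.Empty:
            if not proc.is_alive():
                # The child is gone. Try once more in case the result
                # arrived between the timeout and the liveness check.
                try:
                    result = out_queue.get(timeout=poll_s)
                    got_result = True
                except queue_module.Empty:
                    pass
                break

    if not got_result and proc.is_alive():
        # Forceful termination, the counterpart of "kill -9" above.
        proc.kill()
    proc.join()
    return result
```

Call it from under an `if __name__ == '__main__':` guard, e.g. `run_with_timeout(slow_square, 1, timeout_s=5)`. A worker that overruns the timeout, or that exits early as in the question, yields `None` instead of blocking the parent.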

Upvotes: 2
