Reputation: 37
Below is a simple example that freezes because a child process exits without returning anything and the parent keeps waiting forever. Is there a way to time out a process if it takes too long and let the rest continue? I am a beginner with multiprocessing in Python, and I find the documentation not very illuminating.
import multiprocessing as mp
import sys
import time

def foo(x):
    if x == 3:
        sys.exit()  # the child exits without returning anything
    result = x  # placeholder for some heavy computation here
    return result

if __name__ == '__main__':
    pool = mp.Pool(mp.cpu_count())
    results = pool.map(foo, [1, 2, 3])  # never returns
Upvotes: 3
Views: 2832
Reputation: 294
I had the same problem, and this is how I solved it. There may be better solutions, but this one also handles issues not mentioned in the question. For example, if the process is using a lot of resources, a normal termination request can take a while to get through to it, so I use a forceful termination (kill -9). This part probably only works on Linux, so you may have to adapt the termination if you are using another OS.
It is part of my own code, so it is probably not copy-pasteable.
from multiprocessing import Process, Queue
import os
import time

timeout_s = 5000  # seconds after which you want to kill the process
queue = Queue()  # results can be written in here, if you have return objects
# INTENSIVE_FUNCTION and ARGS_TO_INTENSIVE_FUNCTION are placeholders for your own code
p = Process(target=INTENSIVE_FUNCTION, args=(ARGS_TO_INTENSIVE_FUNCTION, queue))
p.start()
start_time = time.time()
check_interval_s = 5  # regularly check what the process is doing
kill_process = False
finished_work = False

while not kill_process and not finished_work:
    time.sleep(check_interval_s)
    now = time.time()
    runtime = now - start_time
    if not p.is_alive():
        print("finished work")
        finished_work = True
    if runtime > timeout_s and not finished_work:
        print("prepare killing process")
        kill_process = True

if kill_process:
    while p.is_alive():
        # forcefully kill the process, because often (during heavy computations) a graceful termination
        # can be ignored by a process.
        print(f"send SIGKILL signal to process because exceeding {timeout_s} seconds.")
        os.system(f"kill -9 {p.pid}")
        if p.is_alive():
            time.sleep(check_interval_s)
else:
    try:
        p.join(60)  # wait 60 seconds to join the process
        RETURN_VALS = queue.get(timeout=60)
    except Exception:
        # This can happen if a process was killed for other reasons (such as out of memory)
        print("Joining the process and receiving results failed, results are set as invalid.")
        RETURN_VALS = None
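For comparison, the same idea can probably be written more compactly by waiting on the queue with a timeout and then calling Process.kill() (available since Python 3.7, sends SIGKILL on Unix) instead of shelling out to kill -9. This is only a sketch and not the code I actually run: run_with_timeout and _worker are made-up names, and the "fork" start method is an assumption that keeps the snippet Unix-only.

```python
import multiprocessing as mp
import queue as queue_mod
import time


def _worker(func, args, out):
    # Runs in the child: compute and ship the result back to the parent.
    out.put(func(*args))


def run_with_timeout(func, args=(), timeout_s=5.0):
    """Run func(*args) in a child process; return (finished, result).

    finished is False if no result arrived within timeout_s, in which
    case the child is killed if it is still running.
    """
    ctx = mp.get_context("fork")  # assumption: Unix-like OS
    out = ctx.Queue()
    p = ctx.Process(target=_worker, args=(func, args, out))
    p.start()
    try:
        # Read before joining: a child with a large pending result
        # cannot exit until the parent has drained the queue.
        result = out.get(timeout=timeout_s)
        finished = True
    except queue_mod.Empty:
        # Either the child is still busy or it exited without a result.
        result, finished = None, False
        if p.is_alive():
            p.kill()  # forceful termination, like kill -9
    p.join()
    return finished, result


if __name__ == "__main__":
    print(run_with_timeout(pow, (2, 5), timeout_s=5.0))         # (True, 32)
    print(run_with_timeout(time.sleep, (30,), timeout_s=0.5))   # (False, None)
```

A child that exits early without putting anything on the queue (as in the question) is reported as not finished once the timeout expires, so the parent no longer waits forever.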
Upvotes: 2