Reputation: 11
I have a multiprocessing job that runs several tasks. I want to stop the remaining parallel or dependent tasks as soon as one of them fails, and to finish once all of the tasks have completed.

The problem is with the 1st print. That branch should be entered only when a job has failed with a non-zero exit code (and has not already completed); it then stops the rest of the jobs by breaking out of the while loop. However, even when a process completes successfully with exit code 0, the branch is entered intermittently and the rest of the jobs are stopped because the loop breaks. What is going wrong here?

Failed one: [image]
Passed one: [image]

Main job triggering the multiprocess tasks:
import multiprocessing


def run_block(index):
    print(index)
    # do some execution


def run_blocks(target, dict_blocks):
    process = []
    for (index, (block_id, depend_on)) in \
            enumerate(dict_blocks.items()):
        # args must be a tuple, hence (index,) rather than index
        proc = multiprocessing.Process(target=run_block, args=(index,))
        process.append(proc)
        proc.start()
    check_exit(process)


def check_exit(process):
    done = False
    process_count = len(process)
    count = 0
    completed = []
    while not done:
        for proc in process:
            # 1st: a process failed, stop everything
            if proc.exitcode != 0 and proc.exitcode != None:
                print ('1st', proc, count, done, proc.exitcode)
                done = True
                break
            # 2nd: a process finished cleanly and was not counted yet
            if proc.exitcode == 0 and proc.pid not in completed:
                print ('2nd', proc, count, done, proc.exitcode)
                completed.append(proc.pid)
                count += 1
            # 3rd: every process finished cleanly
            if count == process_count:
                print ('3rd', proc, count, done)
                done = True
                break
    stop_process_exit(process, count, process_count, done)


def stop_process_exit(
    process,
    count,
    process_count,
    done,
):
    print (process_count, count, done, process)
    for proc in process:
        if proc.is_alive():
            proc.terminate()
    if done == True and count != process_count:
        exit(1)
Upvotes: 1
Views: 500
Reputation: 10999
Your processes run independently of the parent, so the value of proc.exitcode is dynamic. In other words, it can change at any moment, because the process may have just finished. In this statement:
if proc.exitcode != 0 and proc.exitcode != None
you access the variable twice. Suppose proc.exitcode is None when you begin to execute this line. Python does the first comparison (None != 0), which evaluates to True. Now suppose that the process finishes at that exact moment, so proc.exitcode becomes zero. Python performs the second comparison (0 != None), and that is also True! So your print statement fires, and then you break out of the loop when you really don't want to.
Of course I don't know this is what's happening since I can't run your program, but the evidence points that way.
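To see the double read in isolation, here is a small sketch that does not start any real processes. FlippingProc is a hypothetical stand-in (not part of multiprocessing) whose exitcode reports None on the first read and 0 afterwards, which imitates a process that exits cleanly between the two comparisons. The helper names failed_double_read and failed_single_read are also made up for this illustration.

```python
class FlippingProc(object):
    """Hypothetical stand-in for a Process that exits between two reads."""

    def __init__(self):
        self._reads = 0

    @property
    def exitcode(self):
        # First read: still running (None). Later reads: exited cleanly (0).
        self._reads += 1
        return None if self._reads == 1 else 0


def failed_double_read(proc):
    # Same condition as in the question: exitcode is read twice.
    return proc.exitcode != 0 and proc.exitcode != None


def failed_single_read(proc):
    # Take one snapshot of exitcode, then test only the snapshot.
    code = proc.exitcode
    return code is not None and code != 0


double_read_result = failed_double_read(FlippingProc())
single_read_result = failed_single_read(FlippingProc())
```

With the double read, a clean exit is reported as a failure; with a single snapshot the process is simply treated as still running on this pass and picked up on the next one.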
I would change the loop like this:
for proc in process:
    if proc.is_alive():
        continue
    if proc.exitcode != 0:
        print ('1st', proc, count, done, proc.exitcode)
        done = True
        break
    # ... everything else is not changed
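Putting the same idea into one function, a minimal sketch could look like the following. It is not a drop-in replacement for your check_exit: the name wait_all_or_first_failure and the poll_interval parameter are my own inventions, and the short sleep between passes is an extra design choice to avoid a busy loop. It assumes every object in processes has already been started and offers is_alive(), exitcode and terminate(), as multiprocessing.Process does.

```python
import time


def wait_all_or_first_failure(processes, poll_interval=0.05):
    """Return True if every process exited with 0, False on the first failure.

    On failure, any process that is still alive is terminated.
    Assumes all processes have already been started.
    """
    pending = list(processes)
    while pending:
        for proc in list(pending):
            if proc.is_alive():
                continue
            # The process has finished, so exitcode can no longer change.
            if proc.exitcode != 0:
                for other in processes:
                    if other.is_alive():
                        other.terminate()
                return False
            pending.remove(proc)
        if pending:
            # Hypothetical tuning knob: pause briefly instead of spinning.
            time.sleep(poll_interval)
    return True
```

The caller can then decide what to do with the result, for example calling exit(1) when it is False, as stop_process_exit does in the question.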
Upvotes: 0