Marc Youcef

Reputation: 69

subprocess.Popen() hanging even with shell=False and p.communicate()

I am desperate: I have been battling with subprocess.Popen() hanging for weeks now. I have read many forum threads, some with contradictory information, and tried most of the suggestions, but nothing has resolved the issue.

What I need is simple: launch X subprocesses that each run a command. I do not even need their stdout or stderr. Each one writes to its own dedicated sub XML file, and I collect those files once all the subprocesses have finished. The command runs a Python script, and I can see in the script's logs that it reaches its last line. I even added a sys.exit() at the end of the called script to make sure it exits properly.

Simplified code looks like:

import logging
import subprocess

processes = []
outputs = []
errors = []

# commands is built elsewhere (not shown)
for cmd in commands:
    p = subprocess.Popen(cmd[0], shell=False)
    processes.append(p)

# We now pause execution on the main process by 'joining' all of our started sub processes.
# This ensures that each has finished processing their own batch.
for index, process in enumerate(processes):
    logging.info(f"Waiting for process #{index}")
    try:
        # No pipes were requested in Popen(), so communicate() returns (None, None)
        (out, err) = process.communicate(timeout=3600)
        outputs.append(out) # not even required
        errors.append(err) # not even required
        process.wait()
        logging.info(f"Process #{index} finished")
    except subprocess.TimeoutExpired:
        # logging.exception() records the active exception itself; passing it as an
        # extra argument without a %s placeholder makes the log call fail to format
        logging.exception(f"Timeout Exception from process #{index}")
        process.kill()
        (out, err) = process.communicate()
    # Reading sub xml
    # reading generated xml for the sub process here

I am not sure what can be deadlocking here! Even the timeout scenario does not raise properly, and it is unpredictable when it will actually hang.

Any help or support is much appreciated.

Thanks.

EDIT: As pointed out by @MauriceMeyer, the timeout currently applies sequentially, and I have no means to make it start counting from the moment I call subprocess.Popen(). It is also a very painful workaround: I cannot predict a suitable timeout value, and it kills performance, while the whole point of using subprocesses was better execution performance.
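
For illustration, a timeout that counts from each launch rather than from the moment the loop reaches the process could be sketched roughly as below. This is only a sketch under assumptions: the child commands are stand-ins that sleep (not the real script), TIMEOUT is an arbitrary value, and the names launched and results are made up for the example. The deadline is stored next to each Popen handle, and wait() is given only the time that is still left.

```python
import subprocess
import sys
import time

TIMEOUT = 2.0  # seconds allowed per child, counted from its own launch (arbitrary value)

# Stand-in commands (assumption): each child only sleeps. The real commands go here.
commands = [
    [sys.executable, "-c", f"import time; time.sleep({delay})"]
    for delay in (0.1, 0.2, 30)
]

launched = []
for cmd in commands:
    p = subprocess.Popen(cmd, shell=False)
    # Store the absolute deadline; time.monotonic() is unaffected by clock changes
    launched.append((p, time.monotonic() + TIMEOUT))

results = []
for p, deadline in launched:
    # Only wait for whatever is left of this child's own budget
    remaining = max(0.0, deadline - time.monotonic())
    try:
        results.append(p.wait(timeout=remaining))
    except subprocess.TimeoutExpired:
        p.kill()
        p.wait()  # reap the killed child so it does not linger as a zombie
        results.append("timeout")

print(results)
```

With this layout the total waiting time is bounded by roughly TIMEOUT after the last launch, instead of TIMEOUT multiplied by the number of processes.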

Upvotes: 2

Views: 1055

Answers (1)

Maurice Meyer

Reputation: 18106

You could manually check whether each process needs to be killed, measuring the timeout from the moment it was started:

import time
import subprocess

TIMEOUT = 5

processes = []
for x in range(1, 10):
    cmd = [f'sleep {x}']
    p = subprocess.Popen(cmd, shell=True) # shell=True for simplicity!

    # add start time, so we can manually kill the process
    processes.append({'handle': p, 'start': time.time(), 'returncode': None})

# loop until every process has a return code (or has been marked as timed out)
while not all(p['returncode'] is not None for p in processes):
    for i, process in enumerate(processes):
        # poll() checks if child process has terminated, returns None if it is running
        rtc = process["handle"].poll()

        if rtc is None:
            # Process is still running, check if it needs to get killed
            if time.time() - process['start'] > TIMEOUT:
                # kills the process!
                process["handle"].terminate()
                process["handle"].wait()
                processes[i]['returncode'] = 'timeout'
        else:
            processes[i]['returncode'] = rtc

    time.sleep(0.1)

for p in processes:
    print(p)

Out:

{'handle': <subprocess.Popen object at 0x10d4d3040>, 'start': 1644318572.923203, 'returncode': 0}
{'handle': <subprocess.Popen object at 0x10d4ee7c0>, 'start': 1644318572.9256551, 'returncode': 0}
{'handle': <subprocess.Popen object at 0x10d550d30>, 'start': 1644318572.928237, 'returncode': 0}
{'handle': <subprocess.Popen object at 0x10d550d90>, 'start': 1644318572.930966, 'returncode': 0}
{'handle': <subprocess.Popen object at 0x10d50fa60>, 'start': 1644318572.934939, 'returncode': 0}
{'handle': <subprocess.Popen object at 0x10d805250>, 'start': 1644318572.937738, 'returncode': 'timeout'}
{'handle': <subprocess.Popen object at 0x10d82f250>, 'start': 1644318572.940351, 'returncode': 'timeout'}
{'handle': <subprocess.Popen object at 0x10d82f370>, 'start': 1644318572.943875, 'returncode': 'timeout'}
{'handle': <subprocess.Popen object at 0x10d82f490>, 'start': 1644318572.946584, 'returncode': 'timeout'}
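
One caveat, offered as a hedged addition: on POSIX, terminate() sends SIGTERM, which a child may handle or ignore, so an untimed wait() right after it could in principle block. A possible variation is to give the child a short grace period and then fall back to kill(). The helper name stop_process and the grace value below are made up for this sketch, and the child is a stand-in that just sleeps, with its output discarded via subprocess.DEVNULL.

```python
import subprocess
import sys


def stop_process(proc, grace=1.0):
    """Ask proc to exit; force it if it is still alive after `grace` seconds."""
    proc.terminate()
    try:
        return proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        # The child did not react to terminate(), so kill it outright
        proc.kill()
        return proc.wait()


# Stand-in child (assumption): sleeps for a minute, output is discarded
child = subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(60)"],
    stdout=subprocess.DEVNULL,
    stderr=subprocess.DEVNULL,
)
returncode = stop_process(child)
print(returncode)
```

In the polling loop above, this helper could stand in for the terminate()/wait() pair.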

Upvotes: 1
