ealeon

Reputation: 12452

Python subprocess.Popen poll seems to hang but communicate works

child = subprocess.Popen(command,
                        shell=True,
                        env=environment,
                        close_fds=True,
                        stdout=subprocess.PIPE,
                        stderr=subprocess.STDOUT,
                        stdin=sys.stdin,
                        preexec_fn=os.setsid
                       )

child_interrupted = False
while child.poll() is None:
    if Signal.isInterrupted():
        child_interrupted = True
        os.killpg(os.getpgid(child.pid), signal.SIGTERM)
        break
    time.sleep(0.1)

subout = child.communicate()[0]
logging.info(subout)

The above works for most of the commands it executes (about 90%), but for some commands it hangs.

For the commands that repeatedly hang, if I remove the loop below, it works fine:

    child_interrupted = False
    while child.poll() is None:
        if Signal.isInterrupted():
            child_interrupted = True
            os.killpg(os.getpgid(child.pid), signal.SIGTERM)
            break
        time.sleep(0.1)

I'm assuming that for those hanging commands, child.poll() returns None even though the job has finished?

So communicate() can tell the process is finished, but poll() can't?

I've run ps -ef on those processes, and they are defunct only when the child.poll() code is in place.
Any idea why?

It looks like defunct means "That's a zombie process, it's finished but the parent hasn't wait()ed for it yet." Well, I'm polling to see whether I can call wait/communicate...

Upvotes: 3

Views: 2735

Answers (1)

ShadowRanger

Reputation: 155323

You've set the Popen object to receive the subprocess's stdout via a pipe. The problem is that you're not reading from that pipe until the process exits. If the process produces enough output to fill the OS-level pipe buffer and you don't drain the pipe, you're deadlocked: the subprocess is waiting for you to read the output it's writing so it can continue to write and then exit, while you're waiting for it to exit before you read the output.
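To see the effect in isolation, here is a minimal sketch. The child command is made up for illustration: it writes about 1 MB to stdout, which is assumed to be well above the pipe buffer size on typical systems. The parent polls for a bounded two seconds without reading, so the script itself never hangs.

```python
import subprocess
import sys
import time

# Hypothetical child: writes about 1 MB to stdout, assumed to be far more
# than the OS pipe buffer can hold.
child = subprocess.Popen(
    [sys.executable, "-c", "import sys; sys.stdout.write('x' * 1000000)"],
    stdout=subprocess.PIPE,
)

# Poll for a bounded time without ever reading from the pipe.
deadline = time.time() + 2.0
while child.poll() is None and time.time() < deadline:
    time.sleep(0.1)

# The child has not exited: it is blocked writing into the full pipe.
still_running = child.poll() is None

# communicate() reads the pipe to EOF, which lets the child finish and exit.
output = child.communicate()[0]
```

With an unbounded poll loop in place of the two-second deadline, the parent would wait forever, which matches the hang described in the question.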

If your explicit polling and interrupt checking are necessary, the easiest solution to this deadlock is probably to launch a thread that drains the pipe:

... launch the thread just after Popen is called ...

draineddata = []
# Trivial thread just reads lines from stdout into the list
drainerthread = threading.Thread(target=draineddata.extend, args=(child.stdout,))
drainerthread.daemon = True
drainerthread.start()

... then where you had been doing communicate, change it to: ...
child.wait()
drainerthread.join()
subout = b''.join(draineddata)  # Combine the data read back to a single output
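Put together with a polling loop, the idea might look like the sketch below. The run_and_drain helper and the noisy child command are made up for illustration, and the interrupt check from the question is left as a comment because Signal.isInterrupted() is the asker's own code.

```python
import subprocess
import sys
import threading
import time


def run_and_drain(command, poll_interval=0.1):
    """Run command, polling for exit while a thread drains its stdout."""
    child = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )

    # The drainer thread reads lines from the pipe as they arrive, so the
    # child never blocks on a full pipe buffer.
    drained = []
    drainer = threading.Thread(target=drained.extend, args=(child.stdout,))
    drainer.daemon = True
    drainer.start()

    while child.poll() is None:
        # An interrupt check (such as the question's Signal.isInterrupted())
        # would go here.
        time.sleep(poll_interval)

    drainer.join()  # Wait until the remaining output has been read
    child.stdout.close()
    return child.returncode, b"".join(drained)


# Hypothetical noisy child: writes about 1 MB to stdout.
noisy = [sys.executable, "-c", "import sys; sys.stdout.write('x' * 1000000)"]
returncode, output = run_and_drain(noisy)
```

Because the pipe is being read the whole time, poll() gets to see the child exit even when the output is much larger than the pipe buffer.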

Upvotes: 3
