Reputation: 12452
child = subprocess.Popen(command,
                         shell=True,
                         env=environment,
                         close_fds=True,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.STDOUT,
                         stdin=sys.stdin,
                         preexec_fn=os.setsid
                         )
child_interrupted = False
while child.poll() is None:
    if Signal.isInterrupted():
        child_interrupted = True
        os.killpg(os.getpgid(child.pid), signal.SIGTERM)
        break
    time.sleep(0.1)
subout = child.communicate()[0]
logging.info(subout)
The above works for most of the commands it executes (about 90%), but for some commands it hangs.
For the commands that repeatedly hang, if I get rid of the code below, it works fine:
child_interrupted = False
while child.poll() is None:
    if Signal.isInterrupted():
        child_interrupted = True
        os.killpg(os.getpgid(child.pid), signal.SIGTERM)
        break
    time.sleep(0.1)
I'm assuming that for those hanging commands, child.poll() is None even though the job is finished?
communicate() can tell the process is finished but poll() can't?
I've executed ps -ef on those processes, and they are defunct only when the child.poll() code is in place.
Any idea why?
It looks like defunct means "That's a zombie process, it's finished but the parent hasn't wait()ed for it yet." Well, I'm polling to see if I can call wait/communicate...
Upvotes: 3
Views: 2735
Reputation: 155323
You've set the Popen object to receive the subprocess's stdout via a pipe. The problem is that you're not reading from that pipe until the process exits. If the process produces enough output to fill the OS-level pipe buffers, and you don't drain the pipe, then you're deadlocked: the subprocess wants you to read the output it's writing so it can continue to write and then exit, while you're waiting for it to exit before you'll read the output.
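To see the stall in isolation, here is a minimal sketch. The child command, the 4 MiB output size, and the name demo_pipe_stall are made up for illustration; the only assumption is that the OS pipe buffer is smaller than the output, which holds for typical default settings.

```python
import subprocess
import sys
import time

# Hypothetical child: writes 4 MiB to stdout, far more than a typical pipe buffer holds.
CHILD_CODE = "import sys; sys.stdout.write('x' * (1 << 22))"


def demo_pipe_stall(wait_seconds=1.0):
    """Start the child, wait without reading its stdout, then drain it."""
    child = subprocess.Popen([sys.executable, "-c", CHILD_CODE],
                             stdout=subprocess.PIPE)
    # Nobody is reading the pipe while we sleep, so the child blocks in write().
    time.sleep(wait_seconds)
    stalled = child.poll() is None
    # communicate() reads the pipe to EOF, which unblocks the child so it can exit.
    output = child.communicate()[0]
    return stalled, len(output), child.returncode


if __name__ == "__main__":
    print(demo_pipe_stall())
```

The poll() call keeps returning None here for as long as nothing reads the pipe, which matches the hang in the question: the child is not finished, it is blocked on its write.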
If your explicit poll and interrupt checking is necessary, the easiest solution to this deadlock is probably to launch a thread that drains the pipe:
# ... launch the thread just after Popen is called ...
draineddata = []
# Trivial thread just reads lines from stdout into the list
drainerthread = threading.Thread(target=draineddata.extend, args=(child.stdout,))
drainerthread.daemon = True
drainerthread.start()
# ... then where you had been doing communicate, change it to: ...
child.wait()
drainerthread.join()
subout = b''.join(draineddata) # Combine the data read back to a single output
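Putting the pieces together, a self-contained sketch might look like the following. The function name run_with_drain, the is_interrupted callback (standing in for Signal.isInterrupted() from the question), and the noisy test command are all hypothetical. It also uses child.terminate() instead of os.killpg so that it runs on any platform; if you need to signal the whole process group, keep preexec_fn=os.setsid and the killpg call from the question.

```python
import subprocess
import sys
import threading
import time


def run_with_drain(command, is_interrupted=lambda: False, poll_interval=0.1):
    """Run command, draining its output in a thread while polling for interrupts."""
    child = subprocess.Popen(command,
                             stdout=subprocess.PIPE,
                             stderr=subprocess.STDOUT)
    drained = []
    # The thread iterates over child.stdout, appending each line until EOF.
    drainer = threading.Thread(target=drained.extend, args=(child.stdout,))
    drainer.daemon = True
    drainer.start()

    interrupted = False
    while child.poll() is None:
        if is_interrupted():
            interrupted = True
            child.terminate()  # swapped in for os.killpg(..., signal.SIGTERM)
            break
        time.sleep(poll_interval)

    child.wait()    # reap the child so it does not linger as a zombie
    drainer.join()  # the thread ends once the pipe reaches EOF
    return b"".join(drained), interrupted


if __name__ == "__main__":
    # Hypothetical noisy command: 1 MiB of output, enough to fill a default pipe buffer.
    noisy = [sys.executable, "-c", "import sys; sys.stdout.write('x' * (1 << 20))"]
    output, interrupted = run_with_drain(noisy)
    print(len(output), interrupted)
```

Because the drainer keeps the pipe empty, the child can always finish writing, so poll() eventually returns an exit code and the loop ends on its own.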
Upvotes: 3