Leopd

Reputation: 42769

Detecting the end of the stream on popen.stdout.readline

I have a Python program which launches subprocesses using Popen and consumes their output in near real time as it is produced. The code of the relevant loop is:

def run(self, output_consumer):
    self.prepare_to_run()
    popen_args = self.get_popen_args()
    logging.debug("Calling popen with arguments %s" % popen_args)
    self.popen = subprocess.Popen(**popen_args)
    while True:
        outdata = self.popen.stdout.readline()
        if not outdata and self.popen.returncode is not None:
            # Terminate when we've read all the output and the returncode is set
            break
        output_consumer.process_output(outdata)
        self.popen.poll()  # updates returncode so we can exit the loop
    output_consumer.finish(self.popen.returncode)
    self.post_run()

def get_popen_args(self):
    return {
        'args': self.command,
        'shell': False, # Just being explicit for security's sake
        'bufsize': 0,   # More likely to see what's being printed as it happens
                        # Not guaranteed, since the process itself might buffer its output
                        # run `python -u` to unbuffer the output of a Python process
        'cwd': self.get_cwd(),
        'env': self.get_environment(),
        'stdout': subprocess.PIPE,
        'stderr': subprocess.STDOUT,
        'close_fds': True,  # Doesn't seem to matter
    }

This works great on my production machines, but on my dev machine, the call to .readline() hangs when certain subprocesses complete. That is, it will successfully process all of the output, including the final output line saying "process complete", but then will again poll readline and never return. This method exits properly on the dev machine for most of the sub-processes I call, but consistently fails to exit for one complex bash script that itself calls many sub-processes.

It's worth noting that popen.returncode gets set to a non-None (usually 0) value many lines before the end of the output. So I can't just break out of the loop when that is set or else I lose everything that gets spat out at the end of the process and is still buffered waiting for reading. The problem is that when I'm flushing the buffer at that point, I can't tell when I'm at the end because the last call to readline() hangs. Calling read() also hangs. Calling read(1) gets me every last character out, but also hangs after the final line. popen.stdout.closed is always False. How can I tell when I'm at the end?

All systems are running python 2.7.3 on Ubuntu 12.04LTS. FWIW, stderr is being merged with stdout using stderr=subprocess.STDOUT.

Why the difference? Is it failing to close stdout for some reason? Could the sub-sub-processes do something to keep it open somehow? Could it be because I'm launching the process from a terminal on my dev box, but in production it's launched as a daemon through supervisord? Would that change the way the pipes are processed and if so how do I normalize them?

Upvotes: 6

Views: 8136

Answers (6)

hynekcer

Reputation: 15568

I wrote a demo with a bash subprocess that can be easily explored. A closed pipe is recognized by readline() returning '' (the empty string), whereas an empty line of output comes back as '\n'.

from subprocess import Popen, PIPE, STDOUT
p = Popen(['bash'], stdout=PIPE, stderr=STDOUT)
out = []
while True:
    outdata = p.stdout.readline()
    if not outdata:
        break
    #output_consumer.process_output(outdata)
    print "* " + repr(outdata)
    out.append(outdata)
print "* closed", repr(out)
print "* returncode", p.wait()

Example session: the pipe can be closed distinctly before the process terminates, which is why wait() should be used instead of poll().

[prompt] $ python myscript.py
echo abc
* 'abc\n'
exec 1>&- # close stdout
exec 2>&- # close stderr
* closed ['abc\n']
exit
* returncode 0
[prompt] $

Your original code would output a huge number of empty strings in this case (the pipe is already closed, but returncode is still None).


Example: a process that terminates quickly, without '\n' on the last line:

echo -n abc
exit
* 'abc'
* closed ['abc']
* returncode 0

Upvotes: 0

jdm

Reputation: 10130

I find that calls to read (or readline) sometimes hang, despite previously calling poll. So I resorted to calling select to find out if there is readable data. However, select without a timeout can hang, too, if the process was closed. So I call select in a semi-busy loop with a tiny timeout for each iteration (see below).

I'm not sure if you can adapt this to readline, as readline might hang if the final \n is missing, or if the process doesn't close its stdout before you close its stdin and/or terminate it. You could wrap this in a generator, and every time you encounter a \n in stdout_collected, yield the current line.

Also note that in my actual code, I'm using pseudoterminals (pty) to wrap the popen handles (to more closely fake user input) but it should work without.

import os
import select
from datetime import datetime

# handle to read from
handle = self.popen.stdout

# how many seconds to wait without data
timeout = 1

begin = datetime.now()
stdout_collected = ""

while self.popen.poll() is None:
    try:
        fds = select.select([handle], [], [], 0.01)[0]
    except select.error, exc:
        print exc
        break

    if len(fds) == 0:
        # select timed out, no new data
        delta = (datetime.now() - begin).total_seconds()
        if delta > timeout:
            return stdout_collected

        # try longer
        continue
    else:
        # have data, timeout counter resets again
        begin = datetime.now()

    for fd in fds:
        if fd == handle:
            data = os.read(handle.fileno(), 1024)  # os.read needs the raw fd, not the file object
            # can handle the bytes as they come in here
            # self._handle_stdout(data)
            stdout_collected += data

# process exited
# if using a pseudoterminal, close the handles here
self.popen.wait()
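The generator idea mentioned above might be sketched like this (my own sketch, in Python 3 syntax; `iter_lines` is a name I made up). It yields complete lines as they arrive and bails out when the process has exited and the pipe has gone quiet, which guards against a grandchild that never closes the inherited write end:

```python
import os
import select
import subprocess

def iter_lines(popen):
    """Yield complete lines from popen.stdout as they arrive."""
    fd = popen.stdout.fileno()
    buf = b''
    while True:
        ready, _, _ = select.select([fd], [], [], 0.1)
        if ready:
            data = os.read(fd, 1024)
            if not data:                     # EOF: every writer closed the pipe
                break
            buf += data
            while b'\n' in buf:
                line, buf = buf.split(b'\n', 1)
                yield line + b'\n'
        elif popen.poll() is not None:
            break                            # process exited and pipe is quiet
    if buf:
        yield buf                            # trailing partial line

p = subprocess.Popen(['bash', '-c', 'echo one; echo two'],
                     stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
for line in iter_lines(p):
    print(line)
```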

Upvotes: 1

damzam

Reputation: 1961

Why are you redirecting stderr to STDOUT?

The real benefit of making a communicate() call on a subprocess is that you retrieve a tuple containing the stdout response as well as the stderr message.

Those might be useful if the logic depends on their success or failure.

It would also save you the pain of having to iterate through lines. communicate() gives you everything at once, so there would be no unresolved question about whether or not the full message was received.
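For reference, a minimal sketch of the communicate() approach (my example, in Python 3 syntax). Note that communicate() buffers everything and only returns after the process exits, so it gives up the near-real-time consumption the question asks for:

```python
import subprocess

# separate pipes so stdout and stderr come back as a tuple
p = subprocess.Popen(['bash', '-c', 'echo out-line; echo err-line >&2'],
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = p.communicate()   # waits for exit; avoids pipe-buffer deadlocks
```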

Upvotes: 0

Martina Ferrari

Reputation: 440

If you use readline() or read(), it should not hang. No need to check returncode or poll(). If it is hanging when you know the process is finished, it is most probably a subprocess keeping your pipe open, as others said before.

There are two things you could do to debug this:

* Try to reproduce the problem with a minimal script instead of the current complex one, or
* Run the complex script with strace -f -e clone,execve,exit_group and see what the script is starting, and whether any process survives the main script (check when the main script calls exit_group; if strace is still waiting after that, you have a child still alive).
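The "subprocess keeping your pipe open" failure mode is easy to reproduce minimally (my sketch, in Python 3 syntax; assumes bash is available). Here bash exits immediately, but its background child inherits the stdout pipe and keeps the write end open for another two seconds:

```python
import subprocess
import time

p = subprocess.Popen(['bash', '-c', 'sleep 2 & echo done'],
                     stdout=subprocess.PIPE)
first = p.stdout.readline()    # b'done\n' arrives immediately
start = time.time()
second = p.stdout.readline()   # blocks until `sleep` exits and closes the pipe
elapsed = time.time() - start
```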

Upvotes: 2

Aya

Reputation: 42060

Without knowing the contents of the "one complex bash script" which causes the problem, there's too many possibilities to determine the exact cause.

However, focusing on the fact that you claim it works if you run your Python script under supervisord, then it might be getting stuck if a sub-process is trying to read from stdin, or just behaves differently if stdin is a tty, which (I presume) supervisord will redirect from /dev/null.

This minimal example seems to cope better with cases where my example test.sh runs subprocesses which try to read from stdin...

import os
import subprocess

f = subprocess.Popen(args='./test.sh',
                     shell=False,
                     bufsize=0,
                     stdin=open(os.devnull, 'rb'),
                     stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT,
                     close_fds=True)

while 1:
    s = f.stdout.readline()
    if not s and f.returncode is not None:
        break
    print s.strip()
    f.poll()
print "done %d" % f.returncode

Otherwise, you can always fall back to using a non-blocking read, and bail out when you get your final output line saying "process complete", although it's a bit of a hack.
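That non-blocking fallback can be set up with fcntl on the pipe's file descriptor (my sketch of the idea, in Python 3 syntax; the sentinel string stands in for the question's "process complete" line):

```python
import fcntl
import os
import subprocess

p = subprocess.Popen(['bash', '-c', 'echo working; echo process complete'],
                     stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
fd = p.stdout.fileno()
# switch the read end of the pipe to non-blocking mode
fcntl.fcntl(fd, fcntl.F_SETFL, fcntl.fcntl(fd, fcntl.F_GETFL) | os.O_NONBLOCK)

collected = b''
while b'process complete' not in collected:
    try:
        chunk = os.read(fd, 1024)
    except BlockingIOError:
        continue                 # no data available yet; busy-wait (or select)
    if not chunk:
        break                    # EOF: the pipe really was closed
    collected += chunk
```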

Upvotes: 2

muudscope

Reputation: 7070

The main code loop looks right. It could be that the pipe isn't closing because another process is keeping it open. For example, if the script launches a background process that writes to stdout, the pipe will not close. Are you sure no other child process is still running?

One idea is to change modes once you see that .returncode has been set. Once you know the main process is done, read all of its remaining output from the buffer, but don't get stuck waiting. You can use select to read from the pipe with a timeout: set a timeout of several seconds, and you can clear the buffer without blocking forever on a child process.
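That drain-with-timeout idea might look something like this (my sketch, in Python 3 syntax; the 2-second timeout is arbitrary):

```python
import os
import select
import subprocess

p = subprocess.Popen(['bash', '-c', 'echo tail-output'],
                     stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
p.wait()                                   # returncode is now set
fd = p.stdout.fileno()
drained = b''
while True:
    # wait at most 2 seconds for more leftover output
    ready, _, _ = select.select([fd], [], [], 2.0)
    if not ready:
        break                              # timed out: stop waiting on the pipe
    chunk = os.read(fd, 4096)
    if not chunk:
        break                              # EOF: every writer closed the pipe
    drained += chunk
```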

Upvotes: 3
