Wang

Reputation: 8173

What is the simplest interface to let a subprocess write to both a file and stdout/stderr?

I want something with an effect similar to cmd > >(tee -a {{ out.log }}) 2> >(tee -a {{ err.log }} >&2) using Python's subprocess module, without calling tee. Basically, write stdout to both stdout and out.log, and write stderr to both stderr and err.log. I know I could use a loop to handle it. But I already have lots of Popen and subprocess.run calls in my code and do not want to rewrite the entire thing, so I wonder whether some package provides an easier interface that would let me do something like:

subprocess.run(["ls", "-l"], stdout=some_magic_file_object(sys.stdout, 'out.log'), stderr=some_magic_file_object(sys.stderr, 'err.log'))

Upvotes: 5

Views: 644

Answers (1)

Will Da Silva

Reputation: 7040

No simple way as far as I can tell, but here is a way:

import os


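class Tee:
    def __init__(self, *files, bufsize=1):
        # Accept raw file descriptors or objects that expose fileno().
        files = [x.fileno() if hasattr(x, 'fileno') else x for x in files]
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid:
            # Parent: keep only the write end, which subprocess will write to.
            os.close(read_fd)
            self._fileno = write_fd
            self.child_pid = pid
            return
        # Child: copy everything from the pipe to every target until EOF.
        os.close(write_fd)
        while buf := os.read(read_fd, bufsize):
            for f in files:
                os.write(f, buf)
        os._exit(0)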

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.close()

    def fileno(self):
        return self._fileno

    def close(self):
        os.close(self._fileno)
        os.waitpid(self.child_pid, 0)

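This Tee object takes any number of file objects (i.e. objects that either are integer file descriptors or have a fileno method). It creates a child process that reads from the read end of a pipe; the write end is what the Tee object's fileno method returns (which is what subprocess.run will write to). The child writes everything it reads to all of the files it was given. Since it relies on os.fork, it only works on Unix-like systems.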

There's some lifecycle management needed: the Tee object's file descriptor must be closed, and the child process must be waited on afterwards. You can either do that manually by calling the Tee object's close method, or use it as a context manager as shown below.

Usage:

import subprocess
import sys


logfile = open('out.log', 'w')
stdout_magic_file_object = Tee(sys.stdout, logfile)
stderr_magic_file_object = Tee(sys.stderr, logfile)

# Use the file objects with as many subprocess calls as you'd like here
subprocess.run(["ls", "-l"], stdout=stdout_magic_file_object, stderr=stderr_magic_file_object)

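# Close the files after you're done with them. Close the Tee objects in
# reverse order of creation: the second Tee's child inherited a copy of the
# first Tee's pipe, so the first child only sees EOF once the second exits.
stderr_magic_file_object.close()
stdout_magic_file_object.close()
logfile.close()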

A cleaner way would be to use context managers, shown below. It would require more refactoring though, so you may prefer manually closing the files instead.

import subprocess
import sys


with open('out.log', 'w') as logfile:
    with Tee(sys.stdout, logfile) as stdout, Tee(sys.stderr, logfile) as stderr:
        subprocess.run(["ls", "-l"], stdout=stdout, stderr=stderr)

One issue with this approach is that the child process writes straight to the stdout file descriptor as soon as data arrives, bypassing Python's own buffering of sys.stdout, so Python's own output will often get interleaved with it out of order. You can work around this by using Tee on a temp file and the log file, and then printing the content of the temp file (and deleting it) once the Tee context block is exited. Making a subclass of Tee that does this automatically would be straightforward, but using it would be a bit cumbersome, since you now need to exit the context block (or otherwise have it run some code) to print out the output of the subprocess.
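For what it's worth, here is a rough sketch of what such a subclass might look like. DeferredTee is a name I made up for illustration, and the Tee class is repeated (unchanged) only so the snippet runs on its own. It assumes the subprocess output is small enough to read back into memory and that it decodes as text; treat it as a starting point rather than a finished implementation.

```python
import os
import subprocess
import sys
import tempfile


class Tee:
    """Same Tee as above, repeated so this sketch is self-contained."""

    def __init__(self, *files, bufsize=1):
        files = [x.fileno() if hasattr(x, 'fileno') else x for x in files]
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid:
            os.close(read_fd)
            self._fileno = write_fd
            self.child_pid = pid
            return
        os.close(write_fd)
        while buf := os.read(read_fd, bufsize):
            for f in files:
                os.write(f, buf)
        os._exit(0)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.close()

    def fileno(self):
        return self._fileno

    def close(self):
        os.close(self._fileno)
        os.waitpid(self.child_pid, 0)


class DeferredTee(Tee):
    """Hypothetical subclass: log right away, replay to `stream` on close."""

    def __init__(self, stream, *files, bufsize=1):
        self._stream = stream
        # The temp file stands in for stdout/stderr while the subprocess runs.
        self._tmp = tempfile.TemporaryFile()
        super().__init__(self._tmp, *files, bufsize=bufsize)

    def close(self):
        # Closes the pipe and waits for the child, so the temp file is complete.
        super().close()
        self._tmp.seek(0)
        self._stream.write(self._tmp.read().decode(errors='replace'))
        self._stream.flush()
        self._tmp.close()  # TemporaryFile is deleted when closed
```

Usage would be the same as before, with the stream moved to the first argument, e.g. `with DeferredTee(sys.stdout, logfile) as stdout: subprocess.run(["ls", "-l"], stdout=stdout)`. The output only shows up on sys.stdout once the with block exits, which is the cumbersome part mentioned above.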

Upvotes: 2
