Edward Z. Yang

Reputation: 26742

Python 3: write binary to stdout respecting buffering

There is an existing question, "How to write binary data to stdout in python 3?", but all of its answers suggest sys.stdout.buffer or variants thereof (e.g., manually rewrapping the file descriptor). These share a problem: they bypass the text layer's own buffer, so output can come out in the wrong order:

MacBook-Pro-116:~ ezyang$ cat test.py
import sys
sys.stdout.write("A")
sys.stdout.buffer.write(b"B")
MacBook-Pro-116:~ ezyang$ python3 test.py | cat
BA

Is there a way to write binary data to stdout while respecting buffering with respect to sys.stdout and unadorned print statements? (The actual use-case is, I have "text-like" data of an unknown encoding and I just want to pass it straight to stdout without making a commitment to a particular encoding.)

Upvotes: 1

Views: 3504

Answers (2)

sleblanc

Reputation: 3921

Can't you interleave the calls to write with calls to flush?

sys.stdout.write("A")

sys.stdout.buffer.write(b"B")

Results in:

BA


sys.stdout.write("A")
sys.stdout.flush()

sys.stdout.buffer.write(b"B")
sys.stdout.flush()

Results in:

AB
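
If the repeated flush calls get tedious, they can be folded into a small helper. This is a minimal sketch, assuming a hypothetical helper named write_bytes (it is not part of the standard library); the in-memory stream at the end stands in for stdout only to show the resulting order:

```python
import io
import sys


def write_bytes(data, text_stream=None):
    """Write raw bytes beneath a text stream without reordering output.

    write_bytes is a hypothetical helper name used for illustration.
    """
    if text_stream is None:
        text_stream = sys.stdout
    # Push any pending text down into the binary layer first, so the
    # bytes written next land after it.
    text_stream.flush()
    text_stream.buffer.write(data)


# Demonstration against an in-memory stream instead of the real stdout.
raw = io.BytesIO()
demo = io.TextIOWrapper(raw, encoding="utf-8")
demo.write("A")          # sits in the text layer's buffer for now
write_bytes(b"B", demo)  # flushes "A" first, then writes b"B"
demo.flush()
result = raw.getvalue()  # b"AB"
```

Calling write_bytes(b"B") with no second argument targets sys.stdout.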

Upvotes: 3

kchan

Reputation: 846

You can define a local function called _print (or even shadow the built-in print function by naming it print) as follows:

import sys

def _print(data):
    """
    If data is bytes, write to stdout using sys.stdout.buffer.write,
    otherwise, assume it's str and convert to bytes with utf-8
    encoding before writing.
    """
    if not isinstance(data, bytes):
        data = data.encode('utf-8')
    sys.stdout.buffer.write(data)

_print('A')
_print(b'B')

The output should be AB.

Note: normally the built-in print function adds a newline to the output. The above _print writes the data as-is (bytes directly, anything else treated as str and encoded as UTF-8), without a trailing newline.

Buffered implementation

If you want buffered I/O, you can manage that yourself with the tools from the io module.

Simple example:

import io
import sys

output_buffer = None
text_wrapper = None

def init_buffer():
    global output_buffer, text_wrapper
    if output_buffer is None:
        output_buffer = io.BytesIO()
        text_wrapper = io.TextIOWrapper(
            output_buffer,
            encoding='utf-8',
            write_through=True)

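def write(data):
    # bytes go straight into the buffer; str goes through the text wrapper,
    # which encodes it and (with write_through=True) passes it on at once
    if isinstance(data, bytes):
        output_buffer.write(data)
    else:
        text_wrapper.write(data)

def flush():
    # emit everything collected so far, then empty the buffer so that a
    # later flush() does not write the same bytes again
    sys.stdout.buffer.write(output_buffer.getvalue())
    sys.stdout.buffer.flush()
    output_buffer.seek(0)
    output_buffer.truncate()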

# initialize buffer, write some data, and then flush to stdout
init_buffer()
write("A")
write(b"B")
write("foo")
write(b"bar")
flush()

If you are performing all the output writes in a function, for example, you can use contextlib.contextmanager to create a factory function that lets you use the with statement:

# This uses the vars and functions in the example above.

import contextlib

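@contextlib.contextmanager
def buffered_stdout():
    """
    Context manager for the `with` statement: collect output in the
    buffer and write it to stdout when the block exits.
    """
    init_buffer()
    fh = sys.stdout.buffer
    try:
        yield fh
    finally:
        # write out whatever was collected, then empty the buffer so that
        # a later use does not write the same bytes again
        fh.write(output_buffer.getvalue())
        fh.flush()
        output_buffer.seek(0)
        output_buffer.truncate()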


# open the buffered output stream and write some data to it
with buffered_stdout():
    write("A")
    write(b"B")
    write("foo")
    write(b"bar")
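
The same idea can also be packaged without module-level globals. This is a rough sketch, assuming a hypothetical class name MixedWriter and an optional target parameter (both invented here for illustration); it encodes str as UTF-8 like the examples above and clears its buffer after each flush:

```python
import io
import sys


class MixedWriter:
    """Hypothetical helper: collect str and bytes, emit them in order on flush."""

    def __init__(self, target=None, encoding="utf-8"):
        # Default to the binary layer of stdout; any binary stream works.
        self._target = sys.stdout.buffer if target is None else target
        self._encoding = encoding
        self._pending = io.BytesIO()

    def write(self, data):
        # str is encoded up front so everything shares one byte buffer.
        if isinstance(data, str):
            data = data.encode(self._encoding)
        self._pending.write(data)

    def flush(self):
        self._target.write(self._pending.getvalue())
        self._target.flush()
        # Reset so a second flush() does not repeat the same bytes.
        self._pending.seek(0)
        self._pending.truncate()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.flush()
        return False  # do not swallow exceptions


# Demonstration against an in-memory target instead of the real stdout.
sink = io.BytesIO()
with MixedWriter(sink) as out:
    out.write("A")
    out.write(b"B")
    out.write("foo")
    out.write(b"bar")
captured = sink.getvalue()  # b"ABfoobar"
```

Using MixedWriter() with no argument sends the collected bytes to sys.stdout.buffer when the with block exits.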


Upvotes: 2
