BelowZero

Reputation: 1373

Why is TextIOWrapper closing the given BytesIO stream?

If I run the following code in Python 3:

from io import BytesIO
import csv
from io import TextIOWrapper


def fill_into_stringio(input_io):
    writer = csv.DictWriter(TextIOWrapper(input_io, encoding='utf-8'), fieldnames=['ids'])
    for i in range(100):
        writer.writerow({'ids': str(i)})

with BytesIO() as input_i:
    fill_into_stringio(input_i)
    input_i.seek(0)

I get an error:

ValueError: I/O operation on closed file.

Whereas if I don't use the TextIOWrapper, the stream stays open. For example, if I modify my function to

def fill_into_stringio(input_io):
    for i in range(100):
        input_io.write(b'erwfewfwef')

I don't get any errors any more, so for some reason TextIOWrapper is closing the stream I would like to read from afterwards. Is this intended, and if so, is there a way to achieve what I am trying to do without writing the csv writer myself?

Upvotes: 9

Views: 5060

Answers (2)

AlexFraser

Reputation: 107

I was getting the same ValueError: I/O operation on closed file. while attempting to redirect stdout to a GUI textbox in PyQt5 from subprocesses and threads. A slightly different application, but the same underlying error: the garbage collector deletes the TextIOWrapper object once there are no more references to it, and it closes the underlying stream as part of that process.

After the function executes, there is no longer any reference in your code to the TextIOWrapper object, so the garbage collector deletes it before input_i.seek(0) is executed. As part of deleting the TextIOWrapper object, the garbage collector closes the buffer that was wrapped. When you go to access the wrapped stream again, it has already been closed, and the error is raised.
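
A minimal sketch of that lifecycle (it relies on CPython's immediate, refcount-based collection; other interpreters may collect later):

from io import BytesIO, TextIOWrapper

buf = BytesIO()
# No reference to the wrapper is kept, so CPython collects it as soon as
# this statement finishes; the cleanup flushes and closes buf as well
TextIOWrapper(buf, encoding='utf-8').write('hello')
print(buf.closed)  # True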

Given this behaviour, I think wrapping the stdout buffer with a TextIOWrapper is usually going to be a bad idea, since it is just going to close your original stdout stream once the TextIOWrapper object is deleted by garbage collection.

In my case I subclassed StringIO so that the write method emits a PyQt signal connected to my other textbox (signals are used to allow thread-safe data transfer). Since StringIO has its own in-memory buffer, it will not touch an underlying stream (such as stdout) or accidentally close it during garbage collection. On further thought, I could probably have subclassed one of the abstract base classes in io instead; maybe next time.

import io
import sys

from PyQt5.QtCore import pyqtSignal


class StdOutRedirector(io.StringIO):
    def __init__(self, update_ui: pyqtSignal):
        super().__init__()
        self.update_ui = update_ui

    def write(self, string):
        # Forward the text to the GUI instead of buffering it
        self.update_ui.emit(string)

try:
    sys.stdout = StdOutRedirector(self.send_text)  # send_text: a pyqtSignal(str) on the widget
    doSomeStuffWithRedirectedStdout()
except Exception as error:
    tell_user("bug happened")
finally:
    sys.stdout = sys.__stdout__  # always restore the real stdout

Upvotes: 0

ShadowRanger

Reputation: 155506

The csv module is the weird one here; most file-like objects that wrap other objects assume ownership of the object in question, closing it when they themselves are closed (or cleaned up in some other way).
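
A small sketch of that ownership behaviour: closing the wrapper explicitly closes the stream it wraps as well.

from io import BytesIO, TextIOWrapper

buf = BytesIO()
wrapper = TextIOWrapper(buf, encoding='utf-8')
wrapper.write('hello')
wrapper.close()    # closing the wrapper...
print(buf.closed)  # True: ...closes the wrapped stream too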

One way to avoid the problem is to explicitly detach from the TextIOWrapper before allowing it to be cleaned up:

def fill_into_stringio(input_io):
    # write_through=True prevents TextIOWrapper from buffering internally;
    # you could replace it with explicit flushes, but you want something 
    # to ensure nothing is left in the TextIOWrapper when you detach
    text_input = TextIOWrapper(input_io, encoding='utf-8', write_through=True)
    try:
        writer = csv.DictWriter(text_input, fieldnames=['ids'])
        for i in range(100):
            writer.writerow({'ids': str(i)})
    finally:
        text_input.detach()  # detach input_io so it isn't closed when text_input is cleaned up
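
With that change, the driver code from the question should run without the ValueError (a sketch reusing the question's imports):

from io import BytesIO

with BytesIO() as input_i:
    fill_into_stringio(input_i)
    input_i.seek(0)
    print(input_i.read())  # the stream is still open and holds the CSV bytes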

The only other built-in way to avoid this is for real file objects, where you can pass a file descriptor along with closefd=False and the file object won't close the underlying descriptor when it is closed or otherwise cleaned up.
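
A short sketch of that escape hatch; 'out.csv' is just a placeholder path for illustration:

import os

fd = os.open('out.csv', os.O_WRONLY | os.O_CREAT, 0o644)
# closefd=False means closing the file object leaves fd itself open
with open(fd, 'w', encoding='utf-8', closefd=False) as f:
    f.write('ids\r\n')
os.close(fd)  # the caller stays responsible for closing the descriptor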

Of course, in your particular case there is a simpler way: make your function expect text-based file-like objects and use them without rewrapping; your function really shouldn't be responsible for imposing an encoding on the caller's output file (what if the caller wanted UTF-16 output?).

Then you can do:

import csv
from io import StringIO

def fill_into_stringio(input_io):
    writer = csv.DictWriter(input_io, fieldnames=['ids'])
    for i in range(100):
        writer.writerow({'ids': str(i)})

# newline='' is the Python 3 way to prevent line-ending translation
# while continuing to operate as text, and it's recommended for any file
# used with the csv module
with StringIO(newline='') as input_i:
    fill_into_stringio(input_i)
    input_i.seek(0)
    # If you really need UTF-8 bytes as output, you can make a BytesIO at this point with:
    # BytesIO(input_i.getvalue().encode('utf-8'))

Upvotes: 18
