Deedee Megadoodoo

Reputation: 863

BytesIO.truncate method does not extend buffer contents

The documentation for the IOBase.truncate method says:

truncate(size=None)

Resize the stream to the given size in bytes (or the current position if size is not specified). The current stream position isn’t changed. This resizing can extend or reduce the current file size. In case of extension, the contents of the new file area depend on the platform (on most systems, additional bytes are zero-filled). The new file size is returned.

Changed in version 3.5: Windows will now zero-fill files when extending.

Given this, I assumed that BytesIO (a subclass of BufferedIOBase, which in turn is a subclass of IOBase) would change its internal buffer size after this method is called.

But the following code snippet shows that my assumption is wrong:

from io import BytesIO

# prints b'\x00\x00\x00\x00\x00\x00\x00\x00'
data = BytesIO(8 * b"\x00")
print(data.getvalue())

# prints 16
print(data.truncate(16))

# prints b'\x00\x00\x00\x00\x00\x00\x00\x00'
print(data.getvalue())

# prints b'\x00\x00\x00\x00\x00\x00\x00\x00'
print(bytes(data.getbuffer()))

Where did I turn the wrong way?

Upvotes: 2

Views: 827

Answers (1)

Jean-François Fabre

Reputation: 140266

Checking the source code, it seems that the documentation isn't up to date with the BytesIO implementation:

static PyObject *_io_BytesIO_truncate_impl(bytesio *self, Py_ssize_t size)
/*[clinic end generated code: output=9ad17650c15fa09b input=423759dd42d2f7c1]*/
{
    CHECK_CLOSED(self);
    CHECK_EXPORTS(self);

    if (size < 0) {
        PyErr_Format(PyExc_ValueError,
                     "negative size value %zd", size);
        return NULL;
    }

    if (size < self->string_size) {
        self->string_size = size;
        if (resize_buffer(self, size) < 0)
            return NULL;
    }

    return PyLong_FromSsize_t(size);
}

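The if (size < self->string_size) test means the buffer is resized only when the requested size is smaller than the current size. When it is equal or greater, nothing is done and the requested size is simply returned.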

My guess is that for real file objects, truncate behaves like the underlying platform (expanding the file), but not for in-memory streams such as BytesIO.
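The difference can be checked with a short experiment. This is only a sketch: the helper name sizes_after_truncate is made up for illustration, and it assumes a platform where truncate extends real files (POSIX systems, or Windows with Python 3.5+ according to the documentation quoted in the question).

```python
import os
import tempfile
from io import BytesIO


def sizes_after_truncate(initial, new_size):
    """Return (file_size, bytesio_size) after calling truncate(new_size) on each.

    Hypothetical helper, written only to compare the two stream types.
    """
    # Real file: truncate is delegated to the operating system.
    with tempfile.TemporaryFile() as f:
        f.write(initial)
        f.truncate(new_size)
        f.flush()
        file_size = os.fstat(f.fileno()).st_size

    # In-memory stream: truncate only ever shrinks the buffer.
    buf = BytesIO(initial)
    buf.truncate(new_size)
    return file_size, len(buf.getvalue())


file_size, buf_size = sizes_after_truncate(8 * b"\x00", 16)
print(file_size, buf_size)
```

With 8 initial bytes and a requested size of 16, the temporary file should report 16 bytes while the BytesIO object still holds 8.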

The required behaviour can be emulated quite simply by writing zero bytes at the end of the object when the requested size is larger than the current one, which is the case where truncate does nothing:

def my_truncate(data, size):
    """Resize a BytesIO object to size bytes, zero-filling when extending."""
    current_size = len(data.getvalue())
    if size < current_size:
        # shrinking: the built-in truncate already handles this
        return data.truncate(size)
    elif size == current_size:
        return size  # nothing to do
    else:
        # store current position
        old_pos = data.tell()
        # go to end
        data.seek(current_size)
        # write zeroes
        data.write(b"\x00" * (size - current_size))
        # restore previous file position
        data.seek(old_pos)
        return size
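The same idea could also be packaged as a subclass, so that existing code calling truncate keeps working unchanged. This is a sketch rather than a tested drop-in replacement: the class name ExtendingBytesIO is hypothetical, and it finds the current size by seeking to the end instead of copying the buffer with getvalue().

```python
import io
from io import BytesIO


class ExtendingBytesIO(BytesIO):
    """Hypothetical BytesIO subclass whose truncate() zero-fills on extension."""

    def truncate(self, size=None):
        old_pos = self.tell()
        if size is None:
            # Same default as IOBase.truncate: use the current position.
            size = old_pos
        # Seeking to the end returns the current size of the buffer.
        current_size = self.seek(0, io.SEEK_END)
        if size > current_size:
            # Extend by appending zero bytes.
            self.write(b"\x00" * (size - current_size))
            self.seek(old_pos)
            return size
        # Restore the position, then let BytesIO shrink (or reject a negative size).
        self.seek(old_pos)
        return super().truncate(size)


data = ExtendingBytesIO(8 * b"\x01")
data.seek(3)
print(data.truncate(16))
print(data.getvalue())
print(data.tell())
```

The stream position is restored in both branches, matching the documented behaviour that truncate does not move it.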

Upvotes: 2
