Marcin

Reputation: 49826

Python: File doesn't read whole file, io.FileIO does - why?

The following code, executed in Python 2.7.2 on Windows, reads only a fraction of the underlying file:

import os

in_file = open(os.path.join(settings.BASEPATH,'CompanyName.docx'))
incontent = in_file.read()
in_file.close()

while this code works just fine:

import io
import os

in_file = io.FileIO(os.path.join(settings.BASEPATH,'CompanyName.docx'))
incontent = in_file.read()
in_file.close()

Why the difference? From my reading of the docs, they should perform identically.

Upvotes: 7

Views: 4272

Answers (1)

Tim Pietzcker

Reputation: 336158

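You need to open the file in binary mode. On Windows, a file opened in text mode (the default for open()) is treated as ending at the first EOF character (Ctrl-Z, byte 0x1A), so read() stops there. A docx is a ZIP file, i.e. compressed binary data, so it is almost certain to contain such a byte somewhere. Text mode also translates "\r\n" to "\n", which would corrupt the data even if no EOF byte were present.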

Try

in_file = open(os.path.join(settings.BASEPATH,'CompanyName.docx'), "rb")

io.FileIO reads a raw byte stream with no text-mode translation, so it is effectively always "binary". That is why your second snippet reads the whole file.
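
To check the equivalence yourself, here is a minimal sketch that writes a small file containing a 0x1A byte and a "\r\n" pair, then reads it back both ways. The payload and the temporary file are made up for illustration and are not a real docx. The sketch only shows that open(..., "rb") and io.FileIO return identical bytes; it does not try to reproduce the text-mode truncation, which is specific to Windows.

```python
import io
import os
import tempfile

# Hypothetical stand-in for binary file content: it includes the
# Ctrl-Z byte (0x1A) and a CRLF pair, the two things text mode mangles.
payload = b"PK\x03\x04 header \x1a middle \r\n tail"

fd, path = tempfile.mkstemp(suffix=".bin")
try:
    # Write the payload in binary mode so it lands on disk unchanged.
    with os.fdopen(fd, "wb") as f:
        f.write(payload)

    # Built-in open() with an explicit binary flag.
    with open(path, "rb") as f:
        via_open = f.read()

    # io.FileIO is raw and unbuffered; its default mode is read-only.
    with io.FileIO(path) as f:
        via_fileio = f.read()
finally:
    os.remove(path)

print(via_open == payload, via_fileio == payload)
```

Both comparisons should come out true on any platform, because neither read goes through text-mode translation.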

Upvotes: 13
