João Pedro

Reputation: 13

Reading a binary file as plain text using Python

A friend of mine wrote a simple poem to a file using C's fprintf function. The file was opened with the 'wb' mode, so the generated file is binary. I'd like to use Python to show the poem as plain text.

What I'm currently getting are lots of strings like this: ��������

The code I am using:

with open("read-me-if-you-can.bin", "rb") as f:
    print f.read()

Upvotes: 0

Views: 11898

Answers (1)

Irmen de Jong

Reputation: 2847

The thing is, when dealing with text written to a file, you have to know (or correctly guess) the character encoding that was used to write it. If the program reading the file assumes the wrong encoding, you will end up with a few strange characters in the text if you're lucky, and with utter garbage if you're unlucky.
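To make that concrete, here is a small Python 3 sketch. The sample line is made up (the real poem's contents are unknown), and UTF-16-LE is only assumed as the writer's encoding for illustration. It decodes the same bytes three ways: with the right encoding, with a wrong one that silently produces garbage, and with a wrong one that yields replacement characters like the ones in the question.

```python
# Hypothetical sample line; the real file's contents are unknown.
text = "Olá, João"

# Assume, purely for illustration, that the writer used UTF-16-LE.
raw = text.encode("utf-16-le")

# Correct encoding: the original text comes back unchanged.
right = raw.decode("utf-16-le")

# Wrong encoding that never raises: every byte maps to some character,
# so the result is garbage interleaved with NUL characters.
wrong = raw.decode("latin-1")

# Wrong encoding with errors="replace": invalid byte sequences become
# U+FFFD, the replacement character that is usually displayed as "�".
replaced = raw.decode("utf-8", errors="replace")

print(repr(right))
print(repr(wrong))
print(repr(replaced))
```

Note that the Latin-1 decode raises no error at all, which is why a clean decode on its own does not prove the encoding is right.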

Don't try to guess, try to know: ask your friend which character encoding he or she used when writing the poetry to the file. You then have to open the file in Python specifying that character encoding. Let's say the answer is "UTF-16-LE" (for the sake of example); you then write:

with open("poetry.bin", encoding="utf-16-le") as f:
    print(f.read())

It seems you're still on Python 2, though, so there you write:

import io
with io.open("poetry.bin", encoding="utf-16-le") as f:
    print f.read()

You could start by trying UTF-8 first, though, since it is a very commonly used encoding.
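If asking is not possible, one way to "try UTF-8 first" is to read the raw bytes and attempt a short list of candidate encodings in order. This is only a sketch in Python 3: the helper name decode_with_candidates and the candidate list are my own choices, not anything from the question, and a decode that succeeds is still not proof that the encoding is the right one.

```python
def decode_with_candidates(raw, candidates=("utf-8", "utf-16-le", "cp1252")):
    """Return (text, encoding) for the first candidate that decodes cleanly.

    Hypothetical helper: the candidate list is an assumption and should be
    adjusted to whatever encodings are plausible for the file at hand.
    """
    for name in candidates:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue  # this candidate does not fit, try the next one
    raise ValueError("none of the candidate encodings fit the data")


# Demonstration with made-up sample bytes instead of the real file.
utf8_bytes = "João".encode("utf-8")
utf16_bytes = "João".encode("utf-16-le")

print(decode_with_candidates(utf8_bytes))
print(decode_with_candidates(utf16_bytes))
```

For the actual file you would read the bytes first, for example with open("poetry.bin", "rb") and f.read(), and pass the result to the helper. UTF-8 goes first because its strict decoder rejects most data that is not really UTF-8, whereas single-byte encodings such as cp1252 accept almost anything and so belong at the end.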

Upvotes: 1
