MidnightLightning

Reputation: 6928

Python bytearray ignoring encoding?

I have code that reads binary data from a string buffer (a StringIO object) and tries to convert it to a bytearray. It throws an error whenever a byte value is greater than 127, which the ascii codec can't handle, even though I'm trying to override the encoding:

file = open(filename, 'r+b')
file.seek(offset)
chunk = file.read(length)
chunk = zlib.decompress(chunk)
chunk = StringIO(chunk)

d = bytearray(chunk.read(10), encoding="iso8859-1", errors="replace")

Running that code gives me:

  d = bytearray(chunk.read(10), encoding="iso8859-1", errors="replace")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in position 3: ordinal not in range(128)

Obviously 240 (0xf0 in decimal) is outside the ASCII range, which is why I'm explicitly setting the encoding, but it seems to be ignored.

Upvotes: 4

Views: 10231

Answers (2)

kindall

Reputation: 184220

When a string is converted to another encoding, its original encoding is taken to be ASCII if it is a str, or Unicode if it is a unicode object. Passing an encoding together with a str therefore triggers an implicit ASCII decode first, and that decode is what raises the error. When creating the bytearray, the encoding parameter is required only if the string is unicode. Just don't specify an encoding and you will get the results you want.
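As a rough sketch of that advice: the snippet below builds a bytearray from buffered binary data without an encoding argument. The `raw` sample bytes are made up for illustration, and `io.BytesIO` is swapped in for the question's StringIO as an assumption, so that the same code runs on both Python 2 and Python 3. The final `try` block only illustrates the kind of ASCII decode that fails on a byte above 127.

```python
import io
import zlib

# Hypothetical stand-in for the decompressed chunk; it contains bytes > 127.
raw = bytes(bytearray([0x01, 0x02, 0x03, 0xF0, 0xFF]))
chunk = io.BytesIO(zlib.decompress(zlib.compress(raw)))

# No encoding argument: the bytes are copied into the bytearray unchanged.
d = bytearray(chunk.read(10))
print(list(d))  # [1, 2, 3, 240, 255]

# Decoding the same bytes as ASCII fails on 0xf0, as in the traceback above.
try:
    raw.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True
```

Here `read(10)` returns only the five bytes available, and values such as 240 survive because no codec is involved at any point.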

Upvotes: 9

Hyperboreus

Reputation: 115

I am not quite sure what the problem is.

StringIO is for string IO, not for binary IO. If you want a bytearray representing the whole content of the file, use:

with open('filename', 'rb') as f:
    data = bytearray(f.read())

If you want a string containing only the ASCII characters in that file, use:

with open('filename', 'rb') as f:
    asciis = f.read().decode('ascii', 'ignore')

(The binary flag matters mainly on Windows, where text mode translates line endings and can corrupt binary data.)
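A minimal, self-contained sketch of both reads, assuming a throwaway temporary file with made-up contents (the bytes `A`, 0xf0, `B`) in place of the real file:

```python
import os
import tempfile

# Hypothetical sample file: 'A', one non-ASCII byte (0xf0), then 'B'.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(bytes(bytearray([0x41, 0xF0, 0x42])))

try:
    # Whole file as a bytearray; binary mode keeps every byte intact.
    with open(path, "rb") as f:
        data = bytearray(f.read())

    # Only the ASCII characters; bytes outside 0-127 are silently dropped.
    with open(path, "rb") as f:
        text = f.read().decode("ascii", "ignore")
finally:
    os.remove(path)

print(list(data))  # [65, 240, 66]
print(text)        # AB
```

Note that `'ignore'` discards the 0xf0 byte entirely, so this form is only appropriate when losing the non-ASCII data is acceptable.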

Upvotes: 2
