streetCoder3127917

Reputation: 75

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd0 in position 127: unexpected end of data

I get the following error when decoding some characters:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd0 in position 127: unexpected end of data

Below is my code. The body of 'response' is JSON.

response = requests.post('LINK-TO-API', headers=headers, data=data)
result = ""
for i in response:
    result += i.decode('utf-8')

What's wrong with my code? Thanks.

Upvotes: 0

Views: 4453

Answers (1)

Dunes

Reputation: 40713

0xD0 (0b11010000) is one of many bytes that indicate the start of a multi-byte sequence in UTF-8. The number of 1s before the first 0 indicates the length of the sequence*, so 0xD0, with two leading 1s, starts a two-byte sequence. The bits after the first 0 are part of the encoding of the code point.

Basically, the iterator of the response has cut a two-byte encoding in half: one chunk ended with 0xD0 and its continuation byte landed in the next chunk. You should read the entire contents of the response before trying to decode it, e.g.:

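# Collect every chunk first so no multi-byte sequence is decoded in pieces
bytes_ = b''
for chunk in response:
    bytes_ += chunk
# Decode once, after the whole body has been read
result = bytes_.decode('utf8')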
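
The failure can be reproduced without any network call. The following is only a sketch: the sample text and the split point are made up for illustration, and a list of two byte strings stands in for the chunks the response iterator yields. It shows the per-chunk decode failing, then two ways around it: joining before decoding, and using the standard library's incremental decoder, which buffers an incomplete sequence until the next chunk arrives.

```python
import codecs

# Hypothetical sample data: each Cyrillic letter takes two bytes in UTF-8.
text = "Привет"
data = text.encode("utf-8")

# Split in the middle of a two-byte sequence; the first chunk ends with 0xD0.
chunks = [data[:5], data[5:]]

# Decoding chunk by chunk fails, as in the question.
try:
    "".join(chunk.decode("utf-8") for chunk in chunks)
except UnicodeDecodeError as exc:
    error = exc
else:
    error = None

# Option 1: join all the bytes, then decode once.
joined = b"".join(chunks).decode("utf-8")

# Option 2: an incremental decoder keeps the dangling byte until it is completed.
decoder = codecs.getincrementaldecoder("utf-8")()
streamed = "".join(decoder.decode(chunk) for chunk in chunks)
streamed += decoder.decode(b"", final=True)  # flush; raises if data is truncated
```

With requests specifically, iterating over a Response yields 128-byte chunks, which fits the error being reported at position 127. If the whole body fits in memory, response.content (bytes), response.text (decoded string), or response.json() avoid the manual loop entirely.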

* The exception is a byte starting with 10: it is a continuation byte inside a multi-byte sequence, not a 1-byte encoding. (1-byte encodings start with 0.)
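
As a rough illustration of the leading-bit rule, here is a small helper that classifies a single byte by its bit pattern. The function name classify_utf8_byte is made up for this sketch, and it checks only the prefix bits: a few bytes that match a lead-byte pattern (such as 0xC0 and 0xC1) are still never valid in UTF-8.

```python
def classify_utf8_byte(byte: int) -> str:
    """Classify one byte by its leading bits (prefix check only)."""
    if (byte & 0b10000000) == 0b00000000:
        return "single-byte (ASCII)"
    if (byte & 0b11000000) == 0b10000000:
        return "continuation byte"
    if (byte & 0b11100000) == 0b11000000:
        return "lead byte of a 2-byte sequence"
    if (byte & 0b11110000) == 0b11100000:
        return "lead byte of a 3-byte sequence"
    if (byte & 0b11111000) == 0b11110000:
        return "lead byte of a 4-byte sequence"
    return "not a valid UTF-8 prefix"


# 0xD0 is the byte from the error message: 0b11010000, two leading 1s.
kind_of_d0 = classify_utf8_byte(0xD0)
```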

Upvotes: 1
