Reputation: 75
I have a problem decoding some characters. The error is:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd0 in position 127: unexpected end of data
Below is my code. The 'response' body is JSON.
response = requests.post('LINK-TO-API', headers=headers, data=data)
result = ""
for i in response:
    result += i.decode('utf-8')
What's wrong with my code? Thanks.
Upvotes: 0
Views: 4453
Reputation: 40713
0xD0 (0b11010000) is one of many bytes that indicate the start of a multi-byte sequence in UTF-8. The number of 1s before the first 0 indicates the length of the sequence*. The bits after the first 0 are part of the encoding of the code point.
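To make the bit pattern concrete, here is a minimal sketch that reads the sequence length off the lead byte. The function name utf8_sequence_length is hypothetical, and the check only looks at the leading bits; it does not reject every lead byte that a strict UTF-8 decoder would reject.

```python
def utf8_sequence_length(lead_byte: int) -> int:
    """Return how many bytes the UTF-8 sequence starting with lead_byte spans.

    Hypothetical helper for illustration: it inspects only the leading bits.
    """
    if lead_byte >> 7 == 0b0:        # 0xxxxxxx: single-byte (ASCII)
        return 1
    if lead_byte >> 5 == 0b110:      # 110xxxxx: start of a two-byte sequence
        return 2
    if lead_byte >> 4 == 0b1110:     # 1110xxxx: start of a three-byte sequence
        return 3
    if lead_byte >> 3 == 0b11110:    # 11110xxx: start of a four-byte sequence
        return 4
    # 10xxxxxx is a continuation byte, so it cannot start a sequence
    raise ValueError(f"0x{lead_byte:02x} is not a valid lead byte")


print(bin(0xD0))                    # 0b11010000
print(utf8_sequence_length(0xD0))   # 2
print("д".encode("utf-8"))          # b'\xd0\xb4'
```

So a decoder that sees 0xD0 as the last byte of its input is still waiting for one more byte, which is what "unexpected end of data" means.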
Basically, the iterator of the response has cut a two-byte encoding in half. You should read the entire contents of the response before trying to decode it. For example:
# Collect every chunk first so that no multi-byte sequence is split
bytes_ = b''
for chunk in response:
    bytes_ += chunk
result = bytes_.decode('utf8')
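The failure can be reproduced without any network call. The sketch below is built on assumptions: the payload is made up, and the 128-byte chunk size is chosen to match what iterating over a requests response yields, which would explain why the error reports position 127. It compares decoding chunk by chunk, joining first, and using an incremental decoder from the standard library.

```python
import codecs

# Made-up payload: 127 ASCII bytes followed by a two-byte character,
# so the lead byte 0xD0 lands at index 127, the end of the first chunk.
text = "a" * 127 + "д"
payload = text.encode("utf-8")
chunks = [payload[i:i + 128] for i in range(0, len(payload), 128)]

# Decoding a chunk on its own fails because its last byte starts a
# two-byte sequence whose second byte is in the next chunk.
try:
    chunks[0].decode("utf-8")
except UnicodeDecodeError as exc:
    error_message = str(exc)
    print(error_message)

# Option 1: join all the bytes, then decode once.
joined = b"".join(chunks).decode("utf-8")

# Option 2: an incremental decoder buffers incomplete sequences between calls.
decoder = codecs.getincrementaldecoder("utf-8")()
streamed = "".join(decoder.decode(chunk) for chunk in chunks)
streamed += decoder.decode(b"", final=True)  # flush; raises if bytes are left over

print(joined == streamed == text)
```

With requests itself the loop is usually unnecessary: response.content holds the whole body as bytes, response.text holds it decoded, and response.json() parses a JSON body directly.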
* Bytes starting with 10 indicate a continuation byte in a multi-byte sequence rather than a 1-byte encoding.
Upvotes: 1