Decode byte with UTF-8

Question

I am currently querying a kdb system and it is returning the data in bytes. Specifically, in one column I am getting a byte object that looks like this:

b'US $ to UK \xa3 (TTF)'

If I want to decode the string version of this, I can do the following and this works:

result = 'US $ to UK \xa3 (TTF)'.encode().decode()

But I couldn't figure out a way to decode the byte object. Any suggestions?

I've tried

b'US $ to UK \xa3 (TTF)'.decode()

but this gives an exception, as the \xa3 is not encoded yet. Is there a way to convert this byte object into a string literal without decoding?

Ulrich Eckhardt · Accepted Answer

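The encoding of those bytes seems to be ISO-8859-1 (a.k.a. Latin-1), not UTF-8: in Latin-1 the single byte 0xA3 is the pound sign (£), whereas a lone 0xA3 is not valid UTF-8. Once you decode the bytes with the correct encoding, you get a str that you can work with or encode to some other encoding like UTF-8.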

raw = b'US $ to UK \xa3 (TTF)'
text = raw.decode('ISO-8859-1')
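
To make the difference concrete, here is a minimal sketch that extends the snippet above. It assumes the column really is Latin-1; that is only inferred from the single byte 0xA3, and the source system might use something else (Windows-1252, for instance, maps this byte to the same character). The names utf8_error and utf8_bytes are just for illustration.

```python
raw = b'US $ to UK \xa3 (TTF)'

# Decoding as UTF-8 fails: 0xA3 cannot appear on its own in UTF-8.
utf8_error = None
try:
    raw.decode('utf-8')
except UnicodeDecodeError as exc:
    utf8_error = exc  # kept only to show where decoding stopped

# Latin-1 maps each byte 0x00-0xFF to the code point with the same value,
# so 0xA3 becomes U+00A3, the pound sign. (Assumption: the data is Latin-1.)
text = raw.decode('iso-8859-1')

# Re-encoding as UTF-8 turns the pound sign into the two bytes 0xC2 0xA3.
utf8_bytes = text.encode('utf-8')

print(text)
print(utf8_bytes)
```

This also explains why the str version in the question worked: in a str literal, \xa3 already denotes the character U+00A3, so .encode() produces valid UTF-8 that .decode() can read back.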
