Bitcoin Earn

Reputation: 43

How to convert bytes type to str or json?

How can I convert this bytes object to a str or to JSON? I have the following Python bytes value and need to convert it to a string or to JSON format. How can I do that?

b'x\xda\x04\xc0\xb1\r\xc4 \x0c\x85\xe1]\xfe\x9a\x06\xae\xf36\'B\x11\xc9J$?\xbbB\xec\x9eo\xb3"\xde\xc0\x9ero\xc4Ryb\x1b\xe5?K\x18\xaa9\x97\xc4i\xdc\x17\xd6\xc7\xaf\x8f\xf3\x05\x00\x00\xff\xff l\x12l'

Upvotes: 1

Views: 8742

Answers (3)

martineau

Reputation: 123501

You don't have to convert the binary data with the base64 encoding algorithm or into a hexadecimal string, as @Mark Tolonen suggests in his answer. Both require substantially more characters to represent the data than the original has bytes.

Instead, you can take advantage of the fact that JSON strings are "a sequence of zero or more Unicode characters" (per the JSON spec). The latin1 encoding maps every byte value from 0 to 255 to the Unicode code point with the same number, so you can "decode" the binary data as latin1 to get a str and later "encode" that str back to the original binary data.

Here's what I mean:

import json

data = b'x\xda\x04\xc0\xb1\r\xc4 \x0c\x85\xe1]\xfe\x9a\x06\xae\xf36\'B\x11\xc9J$?\xbbB\xec\x9eo\xb3"\xde\xc0\x9ero\xc4Ryb\x1b\xe5?K\x18\xaa9\x97\xc4i\xdc\x17\xd6\xc7\xaf\x8f\xf3\x05\x00\x00\xff\xff l\x12l'

j = {'data': data.decode('latin1')}
s = json.dumps(j)
print(s) # resulting JSON text

# restore back to binary data
j2 = json.loads(s)
data2 = j2['data'].encode('latin1')
assert data2 == data  # Should be identical.

Here's how the lengths of the in-memory strings compare for your sample data:

import base64

print(len(data))                                    # -> 67
print(len(data.decode('latin1')))                   # -> 67
print(len(base64.b64encode(data).decode('ascii')))  # -> 92
print(len(data.hex()))                              # -> 134
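
A caveat worth checking: these are the lengths of the Python objects, not of the JSON text that json.dumps produces. JSON has to escape control characters, and with the default ensure_ascii=True every non-ASCII character is also written as a \uXXXX escape, so the latin1 string can grow considerably once serialized. The sketch below measures the serialized sizes for the same sample data (rebuilt from its hex form so the block is self-contained; the variable names are only illustrative):

```python
import base64
import json

# Same 67 bytes as the sample data, rebuilt from the hex form.
data = bytes.fromhex(
    '78da04c0b10dc4200c85e15dfe9a06aef336274211c94a243fbb42ec9e6fb322'
    'dec09e726fc45279621be53f4b18aa3997c469dc17d6c7af8ff3050000ffff20'
    '6c126c'
)

# Serialize the same payload four ways.
as_latin1 = json.dumps({'data': data.decode('latin1')})
as_latin1_raw = json.dumps({'data': data.decode('latin1')}, ensure_ascii=False)
as_base64 = json.dumps({'data': base64.b64encode(data).decode('ascii')})
as_hex = json.dumps({'data': data.hex()})

# Size of each JSON document once written out as UTF-8.
sizes = {
    'latin1 (default ensure_ascii=True)': len(as_latin1.encode('utf-8')),
    'latin1 (ensure_ascii=False)': len(as_latin1_raw.encode('utf-8')),
    'base64': len(as_base64.encode('utf-8')),
    'hex': len(as_hex.encode('utf-8')),
}
for name, size in sizes.items():
    print(f'{name}: {size} bytes of UTF-8 JSON text')

# The latin1 round trip is lossless either way.
assert json.loads(as_latin1)['data'].encode('latin1') == data
assert json.loads(as_latin1_raw)['data'].encode('latin1') == data
```

On this particular sample, the base64 form ends up shortest once serialized and the default latin1 form longest. Data with fewer control and high bytes would shift the balance, so it is worth measuring on representative input before choosing.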

Note: I learned about the encoding trick from an answer by @Sven Marnach to a question about serializing binary data long ago (and have used it multiple times since).

Upvotes: 0

Mark Tolonen

Reputation: 177971

This looks like random binary data, not encoded text. One way of storing binary data in JSON is to use base64 encoding. The base64 algorithm ensures all the data elements are printable ASCII characters, but the result of b64encode() is still a bytes object, so .decode('ascii') is used to convert the ASCII bytes to a Unicode str of ASCII characters suitable for use in an object targeted for JSON use.

Example:

import base64
import json

data = b'x\xda\x04\xc0\xb1\r\xc4 \x0c\x85\xe1]\xfe\x9a\x06\xae\xf36\'B\x11\xc9J$?\xbbB\xec\x9eo\xb3"\xde\xc0\x9ero\xc4Ryb\x1b\xe5?K\x18\xaa9\x97\xc4i\xdc\x17\xd6\xc7\xaf\x8f\xf3\x05\x00\x00\xff\xff l\x12l'

j = {'data': base64.b64encode(data).decode('ascii')}
s = json.dumps(j)
print(s) # resulting JSON text

# restore back to binary data
j2 = json.loads(s)
data2 = base64.b64decode(j2['data'])
print(data2 == data)

Output:

{"data": "eNoEwLENxCAMheFd/poGrvM2J0IRyUokP7tC7J5vsyLewJ5yb8RSeWIb5T9LGKo5l8Rp3BfWx6+P8wUAAP//IGwSbA=="}
True

A simpler option, at the cost of a longer result, is to use data.hex() to get a hexadecimal string representation and bytes.fromhex() to convert that back to bytes:

>>> s = data.hex()
>>> s
'78da04c0b10dc4200c85e15dfe9a06aef336274211c94a243fbb42ec9e6fb322dec09e726fc45279621be53f4b18aa3997c469dc17d6c7af8ff3050000ffff206c126c'
>>> data2 = bytes.fromhex(s)
>>> data2
b'x\xda\x04\xc0\xb1\r\xc4 \x0c\x85\xe1]\xfe\x9a\x06\xae\xf36\'B\x11\xc9J$?\xbbB\xec\x9eo\xb3"\xde\xc0\x9ero\xc4Ryb\x1b\xe5?K\x18\xaa9\x97\xc4i\xdc\x17\xd6\xc7\xaf\x8f\xf3\x05\x00\x00\xff\xff l\x12l'
>>> data2 == data
True
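
If the bytes values are nested inside a larger structure, the base64 step can be applied automatically through the default hook of json.dumps. The following is only a sketch: the function name encode_bytes and the made-up record are choices for illustration, and the reader of the JSON still has to know which fields to pass through base64.b64decode().

```python
import base64
import json

def encode_bytes(obj):
    """Fallback for json.dumps: base64-encode bytes, reject anything else."""
    if isinstance(obj, (bytes, bytearray)):
        return base64.b64encode(obj).decode('ascii')
    raise TypeError(f'Object of type {type(obj).__name__} is not JSON serializable')

# Made-up record containing the first few bytes of the sample data.
record = {'id': 7, 'payload': b'x\xda\x04\xc0\xb1'}
s = json.dumps(record, default=encode_bytes)
print(s)

# Restoring requires knowing that 'payload' holds base64 text.
restored = json.loads(s)
restored['payload'] = base64.b64decode(restored['payload'])
assert restored == record
```

json.dumps only calls the default function for objects it cannot serialize itself, so ordinary strings and numbers are unaffected.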

Upvotes: 6

Dennis

Reputation: 101

Use the decode() method of the bytes object and pass the encoding that was used as an argument.
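
As a minimal sketch of what that looks like: decode() works when the bytes really are text in a known encoding, which is what the hypothetical text_bytes value below assumes. The first bytes of the data in the question are not valid UTF-8, so the same call raises UnicodeDecodeError there.

```python
import json

# Hypothetical example: bytes that really are UTF-8 encoded JSON text.
text_bytes = b'{"name": "caf\xc3\xa9"}'
text = text_bytes.decode('utf-8')   # bytes -> str
obj = json.loads(text)              # str -> Python object
print(obj['name'])

# The first bytes of the data from the question are not valid UTF-8,
# so decoding them as UTF-8 fails.
binary = b'x\xda\x04\xc0\xb1'
try:
    binary.decode('utf-8')
    decoded_ok = True
except UnicodeDecodeError as exc:
    decoded_ok = False
    print(f'not UTF-8 text: {exc}')
```

For binary data like this, one of the encodings from the other answers (base64, hex, or latin1) is needed instead of a text codec such as UTF-8.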

Upvotes: 1
