mr19

Reputation: 197

Convert bytes embedded in list (or dict) to str for use with json.dumps

I have a function that receives a Python list or dict that may have nested bytes that need to be converted to str before calling json.dumps.

The structure passed to the function is not known in advance. It could be a list, it could contain nested structures, and it could hold multiple data types. If it were only ever passed byte strings, a simple decode() on each one would do the trick.

>>> import json
>>> foo = [b'dog', b'cat', b'cow']
>>> foo2 = [f.decode() for f in foo]
>>> foo2
['dog', 'cat', 'cow']
>>> json.dumps(foo2)
'["dog", "cat", "cow"]'

But in this case we are receiving more complex structures (traceback truncated for brevity):

>>> foo = [[14, 3.5, b'Tom'], [18, -1.2, b'Larry'], [22, -1.7, b'Sue']]
>>> json.dumps(foo)
Traceback (most recent call last):
...
TypeError: Object of type 'bytes' is not JSON serializable

I would like a function that can take an arbitrary Python structure (list, dict, etc.) and return the same structure with all bytes decoded.

>>> foo = [[14, 3.5, b'Tom'], [18, -1.2, b'Larry'], [22, -1.7, b'Sue']]
>>> foo2 = mydecoder(foo)
>>> foo2
[[14, 3.5, 'Tom'], [18, -1.2, 'Larry'], [22, -1.7, 'Sue']]
>>> json.dumps(foo2)
'[[14, 3.5, "Tom"], [18, -1.2, "Larry"], [22, -1.7, "Sue"]]'

So the question is: how do you implement a mydecoder function that takes arbitrary lists/dicts, possibly nested and holding different types, and returns the same structure with all bytes decoded?

Upvotes: 0

Views: 1269

Answers (2)

lmielke

Reputation: 125

Sorry, I was a bit too fast the first time. Here is how to convert to JSON and back:

import json

foo = [[14, 3.5, b'Tom'], [18, -1.2, b'Larry'], [22, -1.7, b'Sue']]
print(foo)
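# to save: decode the bytes in the third column, then serialize
foo = [[i, j, k.decode("utf-8")] for i, j, k in foo]
json_str = json.dumps(foo)

# to reload: parse the JSON, then encode the third column back to bytes
foo = json.loads(json_str)
foo = [[i, j, bytes(k, 'utf-8')] for i, j, k in foo]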

print(foo)
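
The comprehension above assumes every row has exactly three items with the bytes in the last position. For structures whose shape is not known ahead of time, a recursive walk is one option. This is only a sketch: mydecoder is the name used in the question, the encoding parameter is my own addition, and it assumes the bytes are valid UTF-8 and that the only containers involved are dicts, lists and tuples.

```python
import json

def mydecoder(obj, encoding="utf-8"):
    """Return a copy of obj with every bytes object decoded to str."""
    if isinstance(obj, bytes):
        return obj.decode(encoding)
    if isinstance(obj, dict):
        # decode keys too, because json.dumps rejects bytes keys
        return {mydecoder(k, encoding): mydecoder(v, encoding)
                for k, v in obj.items()}
    if isinstance(obj, list):
        return [mydecoder(item, encoding) for item in obj]
    if isinstance(obj, tuple):
        return tuple(mydecoder(item, encoding) for item in obj)
    # ints, floats, str, None, etc. pass through unchanged
    return obj

foo = [[14, 3.5, b'Tom'], [18, -1.2, b'Larry'], {"name": b'Sue', b'age': 22}]
foo2 = mydecoder(foo)
print(json.dumps(foo2))
```

Other container types (sets, custom classes) are returned untouched here, so they would need their own branch if they can show up in your data.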

Upvotes: 0

Mark

Reputation: 92461

You can pass a subclass of json.JSONEncoder to json.dumps that handles the special case of turning your bytes into strings. This lets you avoid walking nested structures and handling edge cases yourself, which the json encoder already does really well. You just tell it what to do when it sees bytes.

Here, you can just handle bytes and let the default encoder do the rest:

import json

class BytesDump(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, bytes):       # deal with bytes
            return obj.decode()          # decode() defaults to UTF-8
        return super().default(obj)      # everything else: base class raises TypeError

foo = [{"key": b'value'}, [14, 3.5, b'Tom'], [18, -1.2, b'Larry'], [22, -1.7, 'Sue']]

json.dumps(foo, cls=BytesDump)

Result:

'[{"key": "value"}, [14, 3.5, "Tom"], [18, -1.2, "Larry"], [22, -1.7, "Sue"]]'
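
For a one-off call, the same hook can also be supplied as a plain function through the default argument of json.dumps instead of a subclass. A rough sketch follows; the helper name decode_bytes is made up for this example and it assumes UTF-8 data. One caveat that applies to both forms: the hook is only consulted for values, not for dict keys, so bytes used as keys still raise TypeError.

```python
import json

def decode_bytes(obj):
    # hypothetical helper: json.dumps calls it only for objects it cannot serialize
    if isinstance(obj, bytes):
        return obj.decode("utf-8")
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

foo = [{"key": b'value'}, [14, 3.5, b'Tom']]
result = json.dumps(foo, default=decode_bytes)
print(result)

# bytes used as a dict key never reach the default hook
try:
    json.dumps({b'key': 1}, default=decode_bytes)
    key_error = None
except TypeError as exc:
    key_error = exc
print(type(key_error).__name__)
```

If bytes keys are possible, they have to be converted before the call to json.dumps.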

Upvotes: 4
