Convert bytes embedded in list (or dict) to str for use with json.dumps

Question

I have a function that receives a Python list or dict that may have nested bytes that need to be converted to str before calling json.dumps.

The data structure received by the function is not known in advance. It could be a list, it could contain nested structures, and it could hold multiple data types. If it were just a flat list of byte strings, a simple decode() on each element would do the trick.

>>> foo = [b'dog', b'cat', b'cow']
>>> foo2 = [f.decode() for f in foo]
>>> foo2
['dog', 'cat', 'cow']
>>> json.dumps(foo2)
'["dog", "cat", "cow"]'

But in this case we are receiving more complex structures (traceback truncated for brevity).

>>> foo = [[14, 3.5, b'Tom'], [18, -1.2, b'Larry'], [22, -1.7, b'Sue']]
>>> json.dumps(foo)
Traceback (most recent call last):
...
TypeError: Object of type 'bytes' is not JSON serializable

I would like to have a function that can take an arbitrary Python structure (list, dict, etc.) and return the same structure with all bytes decoded.

>>> foo = [[14, 3.5, b'Tom'], [18, -1.2, b'Larry'], [22, -1.7, b'Sue']]
>>> foo2 = mydecoder(foo)
>>> foo2
[[14, 3.5, 'Tom'], [18, -1.2, 'Larry'], [22, -1.7, 'Sue']]
>>> json.dumps(foo2)
'[[14, 3.5, "Tom"], [18, -1.2, "Larry"], [22, -1.7, "Sue"]]'

So the question is: how do you implement a mydecoder function that can take arbitrary lists/dicts, possibly nested, with different types, and return the same structure with all bytes decoded?

Mark · Accepted Answer

You can pass a subclass of json.JSONEncoder to json.dumps that handles the special case of stringifying your bytes. This lets you avoid walking nested structures and handling edge cases yourself, which the JSON encoder already does really well. You just tell it what to do when it sees bytes.

Here, you can just handle bytes and let the default encoder do the rest:

import json

class BytesDump(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, bytes):       # deal with bytes
            return obj.decode()          # decodes as UTF-8 by default
        return super().default(obj)      # everything else: base class raises TypeError

foo = [{"key": b'value'}, [14, 3.5, b'Tom'], [18, -1.2, b'Larry'], [22, -1.7, 'Sue']]

json.dumps(foo, cls=BytesDump)

result:

'[{"key": "value"}, [14, 3.5, "Tom"], [18, -1.2, "Larry"], [22, -1.7, "Sue"]]'
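For completeness, here is a minimal sketch of the mydecoder function the question literally asks for. It is not part of the encoder approach above, just one possible way to write it. One case where it may be useful: as far as I know, the default() hook is only consulted for values the encoder cannot serialize, not for dict keys, so a dict with bytes keys still raises TypeError with BytesDump. The encoding parameter and the choice to return tuples as lists are assumptions of this sketch.

```python
import json

def mydecoder(obj, encoding="utf-8"):
    """Return a copy of obj with all bytes values and bytes dict keys decoded to str."""
    if isinstance(obj, bytes):
        return obj.decode(encoding)
    if isinstance(obj, dict):
        # Decode bytes keys too; other key types are left untouched.
        return {
            (k.decode(encoding) if isinstance(k, bytes) else k): mydecoder(v, encoding)
            for k, v in obj.items()
        }
    if isinstance(obj, (list, tuple)):
        # Tuples come back as lists, which is how json.dumps emits them anyway.
        return [mydecoder(item, encoding) for item in obj]
    return obj  # ints, floats, str, None, etc. pass through unchanged

foo = [{b"key": b"value"}, [14, 3.5, b"Tom"], (18, -1.2, b"Larry")]
foo2 = mydecoder(foo)
print(json.dumps(foo2))
# prints: [{"key": "value"}, [14, 3.5, "Tom"], [18, -1.2, "Larry"]]
```

Unlike the encoder subclass, this builds a full copy of the structure before serializing, so it costs extra memory on large inputs. If the bytes are not valid in the chosen encoding, decode() raises UnicodeDecodeError in both approaches.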
