Sergey

Reputation: 21241

How to make a class JSON serializable

How to make a Python class serializable?

class FileItem:
    def __init__(self, fname):
        self.fname = fname

Attempt to serialize to JSON:

>>> import json
>>> x = FileItem('/foo/bar')
>>> json.dumps(x)
TypeError: Object of type 'FileItem' is not JSON serializable

Upvotes: 1484

Views: 1637124

Answers (30)

Jeff Hykin

Reputation: 2637

TLDR: copy-paste Option 1 or Option 2 below

So You Want Python's json module to work with Your Class?

  • The good news: Yeah, a reliable solution exists
  • The bad news: No, there is no python "official" solution
    • By official solution, I mean there is no way (as of 2024) to add a method to your class (like toJSON in JavaScript) and/or no way to register your class with the built-in json module. When something like json.dumps([1,2, your_obj]) is executed, python doesn't check a lookup table or object method.
    • I'm not sure why other answers don't explain this
    • The closest official approach is probably andyhasit's answer which is to inherit from a dictionary. However, inheriting from a dictionary doesn't work very well for many custom classes like AdvancedDateTime, or pytorch tensors.
  • The ideal workaround is this:
    • Add def __json__(self) method to your class
    • Mutate json.dumps to check for __json__ method (affects everywhere, even pip modules that import json)
    • Note: Modifying builtin stuff usually isn't great; however, this change should have no side effects, even if it's applied multiple times by different codebases. It is entirely reversible during runtime (if a module wants to undo the modification). And, for better or worse, it is the best that can be done at the moment.


Option 1: Let a Module do the Patching


pip install json-fix
(extended + packaged version of Fancy John's answer, thank you @FancyJohn)

your_class_definition.py

import json_fix

class YOUR_CLASS:
    def __json__(self):
        # YOUR CUSTOM CODE HERE
        #    you probably just want to do:
        #        return self.__dict__
        return "a built-in object that is naturally json-able"

That's it.


Example usage:

from your_class_definition import YOUR_CLASS
import json

json.dumps([1,2, YOUR_CLASS()], indent=0)
# '[\n1,\n2,\n"a built-in object that is naturally json-able"\n]'

To make json.dumps work for Numpy arrays, Pandas DataFrames, and other 3rd party objects, see the Module (only ~2 lines of code but needs explanation).




How does it work? Well...

Option 2: Patch json.dumps yourself


Note: this approach is simplified. It fails on known edge cases (e.g. if your custom class inherits from dict or another builtin), and it misses out on controlling the JSON behavior for external classes (numpy arrays, datetime, dataframes, tensors, etc.).

some_file_thats_imported_before_your_class_definitions.py

# Step: 1
# create the patch
from json import JSONEncoder
def wrapped_default(self, obj):
    return getattr(obj.__class__, "__json__", wrapped_default.default)(obj)
wrapped_default.default = JSONEncoder().default
   
# apply the patch
JSONEncoder.original_default = JSONEncoder.default
JSONEncoder.default = wrapped_default

your_class_definition.py

# Step 2
class YOUR_CLASS:
    def __json__(self, **options):
        # YOUR CUSTOM CODE HERE
        #    you probably just want to do:
        #        return self.__dict__
        return "a built-in object that is naturally json-able"
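
A quick way to check the Option 2 patch, and to undo it again at runtime, might look like the sketch below. It repeats the patch so that it runs on its own; Point is a hypothetical class used only for illustration.

```python
import json
from json import JSONEncoder

# Same patch as in Step 1 above.
def wrapped_default(self, obj):
    return getattr(obj.__class__, "__json__", wrapped_default.default)(obj)
wrapped_default.default = JSONEncoder().default

JSONEncoder.original_default = JSONEncoder.default
JSONEncoder.default = wrapped_default

class Point:  # hypothetical example class
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __json__(self):
        return {"x": self.x, "y": self.y}

patched = json.dumps([1, 2, Point(3, 4)])
print(patched)  # [1, 2, {"x": 3, "y": 4}]

# Undo the patch at runtime by restoring the saved method.
JSONEncoder.default = JSONEncoder.original_default
del JSONEncoder.original_default

try:
    json.dumps(Point(3, 4))
    reverted = False
except TypeError:
    reverted = True
print(reverted)  # True
```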

_

All other answers seem to be "Best practices/approaches to serializing a custom object"

Which is already covered here in the docs (search "complex" for an example of encoding complex numbers)
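
For reference, the pattern the docs describe is roughly the following: subclass json.JSONEncoder, override default, and pass the subclass via cls. This is a minimal sketch modelled on the complex-number example in the json documentation.

```python
import json

class ComplexEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, complex):
            return [obj.real, obj.imag]
        # Let the base class raise TypeError for anything else.
        return super().default(obj)

encoded = json.dumps(2 + 1j, cls=ComplexEncoder)
print(encoded)  # [2.0, 1.0]
```

Note that this still requires changing every json.dumps call to pass cls, which is exactly what the patching options above avoid.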

Upvotes: 97

Andrew Hill

Reputation: 2030

I had a function that was returning a non-serialisable value that I knew was going to be serialised as its only use.

My solution was to instead return vars(myClass)

def get_data_to_serialise():
  mc= myClass()
  return vars(mc)  # <-- vars(mc) returns mc.__dict__, which is serialisable

You may need a shim function call, but this solution works for almost any class with no external code modification, at the cost of stripping the explicit class name from your return type.

It means you would serialise the same as you print if you've defined it as def __str__(self): return self.__dict__.__str__()

Upvotes: 0

Onur Yıldırım

Reputation: 33674

Here is a simple solution for a simple feature:

.toJSON() Method

Instead of a JSON serializable class, implement a serializer method:

import json

class Object:
    def toJSON(self):
        return json.dumps(
            self,
            default=lambda o: o.__dict__, 
            sort_keys=True,
            indent=4)

So you just call it to serialize:

me = Object()
me.name = "Onur"
me.age = 35
me.dog = Object()
me.dog.name = "Apollo"

print(me.toJSON())

will output:

{
    "age": 35,
    "dog": {
        "name": "Apollo"
    },
    "name": "Onur"
}

For a fully-featured library, you can use orjson.

Upvotes: 872

Paulo Freitas

Reputation: 13649

Another option is to wrap JSON dumping in its own class:

import json

class FileItem:
    def __init__(self, fname: str) -> None:
        self.fname = fname

    def __repr__(self) -> str:
        return json.dumps(self.__dict__)

Or, even better, subclassing FileItem class from a JsonSerializable protocol class:

import json
from typing import Protocol

class JsonSerializable(Protocol):
    def to_json(self) -> str:
        return json.dumps(self.__dict__)

    def __repr__(self) -> str:
        return self.to_json()


class FileItem(JsonSerializable):
    def __init__(self, fname: str) -> None:
        self.fname = fname

Testing:

>>> f = FileItem('/foo/bar')
>>> f.to_json()
'{"fname": "/foo/bar"}'
>>> f
{"fname": "/foo/bar"}
>>> str(f) # string coercion
'{"fname": "/foo/bar"}'

Upvotes: 44

Wizard.Ritvik

Reputation: 11670

To throw yet another log into a 10-year-old fire, I would also offer dataclass-wizard for this task, assuming you're using Python 3.6+. It works well with dataclasses, which is part of the standard library from Python 3.7 onwards.

The dataclass-wizard library will convert your object (and all its attributes recursively) to a dict, and makes the reverse (de-serialization) pretty straightforward too, with fromdict. Also, here is the PyPi link: https://pypi.org/project/dataclass-wizard/.

Disclaimer: I am the creator and maintainer of this library.

import dataclass_wizard
import dataclasses

@dataclasses.dataclass
class A:
    hello: str
    a_field: int

obj = A('world', 123)
a_dict = dataclass_wizard.asdict(obj)
# {'hello': 'world', 'aField': 123}

Or if you wanted a string:

import json

a_str = json.dumps(dataclass_wizard.asdict(obj))

Or if your class extended from dataclass_wizard.JSONWizard:

a_str = your_object.to_json()

Finally, the library also supports dataclasses in Union types, which basically means that a dict can be de-serialized into an object of either class C1 or C2. For example:

from dataclasses import dataclass

from dataclass_wizard import JSONWizard

@dataclass
class Outer(JSONWizard):

    class _(JSONWizard.Meta):
        tag_key = 'tag'
        auto_assign_tags = True

    my_string: str
    inner: 'A | B'  # alternate syntax: `inner: typing.Union['A', 'B']`

@dataclass
class A:
    my_field: int

@dataclass
class B:
    my_field: str


my_dict = {'myString': 'test', 'inner': {'tag': 'B', 'myField': 'test'}}
obj = Outer.from_dict(my_dict)

# True
assert repr(obj) == "Outer(my_string='test', inner=B(my_field='test'))"

obj.to_json()
# {"myString": "test", "inner": {"myField": "test", "tag": "B"}}

Upvotes: -1

user2138149

Reputation: 17268

Given that there is no "standardized" way to perform serialization and deserialization in Python (compare Rust, a language I happen to know, which does both well), I think it would be helpful to have an answer that collects a summary of the possible approaches along with their advantages, disadvantages and performance comparisons.

I cannot provide all this information myself, at least not all at once. So I will start off by providing some information and leave this answer for others to edit and contribute to. I will provide a summary of the most notable answers thus far. For the ones I have missed please freely edit this question or comment and someone will update it. (Hopefully)

When this becomes "production ready" I will clean up this preamble to remove it. My aim would be for this to become a long-term reference which provides the relevant information succinctly, rather than have it be distributed across a large number of individual answers, each arguing their case for why they should be used.

General

  • Serialization is a many-to-1 operation, meaning that once serialized, type information is lost and the same serialized string could represent infinitely many possible parent types. The obvious example is that of a set and a list: two different types which, if they contain the same elements, would be serialized in the same way.
  • Many languages solve this problem by explicitly providing type information as part of the deserialization function call. For example Type::deserialize() or deserialize(..., type=Type). This is not code for any particular language, it is simply here to present how type information might be present in code.
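
The two points above can be sketched with the standard json module. A tuple stands in for the set here, because json does not serialize sets out of the box; Point and its from_dict method are hypothetical names used only for illustration.

```python
import json

as_list = json.dumps([1, 2, 3])
as_tuple = json.dumps((1, 2, 3))
print(as_list == as_tuple)  # True: both become the JSON array [1, 2, 3]

# Decoding cannot know which type was intended; json always returns a list.
decoded = json.loads(as_tuple)
print(type(decoded).__name__)  # list

class Point:  # hypothetical class for illustration
    def __init__(self, x, y):
        self.x, self.y = x, y

    @classmethod
    def from_dict(cls, data):
        # The caller supplies the type by choosing which from_dict to call.
        return cls(data["x"], data["y"])

p = Point.from_dict(json.loads('{"x": 1, "y": 2}'))
print(p.x, p.y)  # 1 2
```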

json

Advantages:

  • Native to Python
  • Will serialize basic types: dict, list, str, int, float, bool, None

Disadvantages:

  • Does not serialize recursively, meaning if one Python object contains another then the containing object is not serializable
  • Does not serialize common types like datetime.datetime or datetime.date
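
As a rough illustration of the datetime limitation, one common workaround is to pass a default function that converts dates to ISO 8601 strings. The helper name encode_extra is made up for this sketch.

```python
import json
from datetime import date, datetime

def encode_extra(obj):
    # Hypothetical helper: ISO 8601 strings for dates and datetimes.
    if isinstance(obj, (datetime, date)):
        return obj.isoformat()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

payload = {"created": datetime(2024, 1, 2, 3, 4, 5), "day": date(2024, 1, 2)}
encoded = json.dumps(payload, default=encode_extra)
print(encoded)
# {"created": "2024-01-02T03:04:05", "day": "2024-01-02"}
```

The reverse direction is not automatic: json.loads returns these values as plain strings.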

jsons

Advantages:

  • Correctly serializes and deserializes recursively nested types (?)
  • Correctly serializes and deserializes common types like datetime objects (?)

Disadvantages:

  • Slow (?)

jsonpickle

Advantages:

  • Correctly serializes and deserializes recursively nested types (?)
  • Correctly serializes and deserializes common types like datetime objects (?)

Disadvantages:

  • Type information is present in the encoded (serialized) output.
  • This means that the encoded output is more verbose, contains fields that you might not expect to see (type info) and the encoding is "special" to both Python and the jsonpickle library.
  • If you deserialize this in another language or using another Python library, you will obviously not have access to the same behaviours. (In other words the code you write will behave differently. This is sort of obvious and goes without saying.)
  • You can suppress the type information using an argument unpicklable=False
  • Might be slow? (Citation required: please edit to add performance comparisons)

inheriting from dict, using Python's inbuilt json library

Advantages:

  • Works without requiring other libraries
  • Minimal boilerplate

Disadvantages:

  • Does not work for non-serializable types like datetime
  • Does not work for nested types (?)
  • Since your type is probably not actually a dict but something else, this violates fundamental principles of OOP design

Considerations:

  • You can use __getattr__ and __setattr__ methods so that it will use the dict values for any undefined attributes, see answer by Sunding Wei

Use composition over inheritance, aka wrap a dict

Advantages:

  • Does not violate OOP design compared to above alternative
  • Works for nested types, but requires a lot of boilerplate, better for things that do not have nesting

Disadvantages:

  • Unless more boilerplate is added, accessing elements of the dict requires more code, and the resulting code is less intuitive. If the dict object is named data_dict, then rather than accessing my_class.my_field one has to write my_class.data_dict.my_field
  • Properties or getters/setters can mitigate this but that requires maintenance of 2N functions for N fields
  • Requires adding from_dict class method for deserialization and __json__ or to_json for serializing
  • As such, this is a more manual operation compared to the previously presented examples. That might be preferable/acceptable if explicit code is preferred for some cases where there is no nesting of types
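
A minimal sketch of the wrap-a-dict approach, assuming a single field and no nesting. The attribute name _data and the to_json / from_dict methods are illustrative choices, not a fixed convention.

```python
import json

class FileItem:
    def __init__(self, fname):
        self._data = {"fname": fname}  # the wrapped dict

    @property
    def fname(self):
        # One property per field keeps attribute-style access.
        return self._data["fname"]

    def to_json(self):
        return json.dumps(self._data)

    @classmethod
    def from_dict(cls, data):
        return cls(data["fname"])

item = FileItem("/foo/bar")
text = item.to_json()
print(text)  # {"fname": "/foo/bar"}

restored = FileItem.from_dict(json.loads(text))
print(restored.fname)  # /foo/bar
```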

Manual implementation by returning a dict as an interface type

  • See answer by Martin Thoma, it is similar to the above option of wrapping a dict

Advantages:

  • Uses explicit type information from TYPE.from_json class methods
  • Allows creating a class with explicitly named fields rather than keys in a dictionary

Disadvantages:

  • Requires two step loading instead of a single line of code
  • Requires some boilerplate
  • Since the serialization is being done by relying on conversion of the class to a dict structure, might consider using above method more straightforward
  • Does not work in cases where the fields include types like datetime

subclassing JSONEncoder and JSONDecoder

Advantages:

  • Leverages Python native json library
  • others?

Disadvantages:

  • Not a 1 line solution
  • Requires creating a class to serialize and deserialize every type you want to be able to serialize and deserialize

Side note: This looks like it should be the "canonical" choice ... but I'm not completely sure I understand it and the fact that this weird "hook" thing is required makes me suspect it's perhaps not that generalizable? Maybe someone else can edit this section and clarify?
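
To make the "hook" part more concrete, here is a sketch of a round trip. It uses a JSONEncoder subclass for encoding and the object_hook parameter of json.loads for decoding (rather than a full JSONDecoder subclass). The "__type__" tag is a made-up convention for this example, not something the json module defines.

```python
import json

class FileItem:
    def __init__(self, fname):
        self.fname = fname

class FileItemEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, FileItem):
            # "__type__" is a made-up tag so the decoder can recognise the dict.
            return {"__type__": "FileItem", "fname": obj.fname}
        return super().default(obj)

def file_item_hook(d):
    # Called by json.loads for every decoded JSON object (dict).
    if d.get("__type__") == "FileItem":
        return FileItem(d["fname"])
    return d

text = json.dumps({"item": FileItem("/foo/bar")}, cls=FileItemEncoder)
print(text)  # {"item": {"__type__": "FileItem", "fname": "/foo/bar"}}

data = json.loads(text, object_hook=file_item_hook)
print(type(data["item"]).__name__, data["item"].fname)  # FileItem /foo/bar
```

The hook is needed because JSON itself carries no type information, so the decoder has to be told how to recognise which dicts should become which objects.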

default=vars

Advantage:

  • Very quickly allows serialization of custom objects

Disadvantage:

  • Only works for types serializable with native json library, does not work for types like datetime
  • Does not work for nested types (?)
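
On the nested-types question marked (?) above: json.dumps calls default again for any value inside the returned dict that it cannot encode, so nesting of plain objects does work with default=vars. The datetime limitation is real, though. A small sketch with hypothetical classes Inner, Outer and Stamped:

```python
import json
from datetime import datetime

class Inner:
    def __init__(self):
        self.value = 1

class Outer:
    def __init__(self):
        self.inner = Inner()

nested = json.dumps(Outer(), default=vars)
print(nested)  # {"inner": {"value": 1}}

class Stamped:
    def __init__(self):
        self.when = datetime(2024, 1, 2)

try:
    json.dumps(Stamped(), default=vars)
    failed = False
except TypeError:
    # vars() raises TypeError for objects without a __dict__, such as datetime.
    failed = True
print(failed)  # True
```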

json-fix

See answer by Jeff Hykin

simplejson

todo

json-tricks

todo

jsonweb

todo

Upvotes: -1

rectangletangle

Reputation: 53031

import json

class Foo(object):
    def __init__(self):
        self.bar = 'baz'
        self._qux = 'flub'

    def somemethod(self):
        pass

'''
The parameter default(obj) is a function that should return a 
serializable version of obj or raise TypeError. The default 
default simply raises TypeError. 

https://docs.python.org/3.4/library/json.html#json.dumps
'''
def default(instance):
    return {k: v
            for k, v in vars(instance).items()
            if not str(k).startswith('_')}

json_foo = json.dumps(Foo(), default=default)
assert '{"bar": "baz"}' == json_foo

print(json_foo)

Upvotes: 4

Artur Barseghyan

Reputation: 14212

A really simplistic one-liner solution

import json

json.dumps(your_object, default=vars)

The end!

What comes below is a test.

import json
from dataclasses import dataclass


@dataclass
class Company:
    id: int
    name: str

@dataclass
class User:
    id: int
    name: str
    email: str
    company: Company


company = Company(id=1, name="Example Ltd")
user = User(id=1, name="John Doe", email="[email protected]", company=company)


json.dumps(user, default=vars)

Output:

{
  "id": 1, 
  "name": "John Doe", 
  "email": "[email protected]", 
  "company": {
    "id": 1, 
    "name": "Example Ltd"
  }
}

Upvotes: 26

gecco

Reputation: 18860

For more complex classes you could consider the tool jsonpickle:

jsonpickle is a Python library for serialization and deserialization of complex Python objects to and from JSON.

The standard Python libraries for encoding Python into JSON, such as the stdlib’s json, simplejson, and demjson, can only handle Python primitives that have a direct JSON equivalent (e.g. dicts, lists, strings, ints, etc.). jsonpickle builds on top of these libraries and allows more complex data structures to be serialized to JSON. jsonpickle is highly configurable and extendable–allowing the user to choose the JSON backend and add additional backends.

Transform an object into a JSON string:

import jsonpickle
json_string = jsonpickle.encode(obj)

Recreate a Python object from a JSON string:

recreated_obj = jsonpickle.decode(json_string)

(link to jsonpickle on PyPi)

Upvotes: 282

PJ127

Reputation: 1258

I don't know if that suits your needs, but using orjson as json and adding a dataclass decorator to your class solves the problem:

from dataclasses import dataclass

@dataclass()
class FileItem:
    def __init__(self, fname):
        self.fname = fname

import orjson as json
x = FileItem("/foo/bar")
json.dumps(x)
# -> returns b'{"fname":"/foo/bar"}'

Upvotes: 1

JoergVanAken

Reputation: 1286

If the object can be pickled, one can use the following two functions to encode and decode an object:

import base64
import json
import pickle

def obj_to_json(obj):
    pickled = pickle.dumps(obj)
    coded = base64.b64encode(pickled).decode('utf8')
    return json.dumps(coded)

def json_to_obj(s):
    coded = base64.b64decode(json.loads(s))  # undo json.dumps before decoding
    return pickle.loads(coded)

This is, for example, useful in combination with pytest and config.cache.

Upvotes: 1

Sunding Wei

Reputation: 2234

The simplest answer

import json

class Object(dict):
    def __init__(self):
        pass

    def __getattr__(self, key):
        return self[key]

    def __setattr__(self, key, value):
        self[key] = value

# test
obj = Object()
obj.name = "John"
obj.age = 25
obj.brothers = [ Object() ]
text = json.dumps(obj)

Now it gives you the output without changing anything in the call to json.dumps(...):

'{"name": "John", "age": 25, "brothers": [{}]}'

Upvotes: 9

andyhasit

Reputation: 15329

Most of the answers involve changing the call to json.dumps(), which is not always possible or desirable (it may happen inside a framework component for example).

If you want to be able to call json.dumps(obj) as is, then a simple solution is inheriting from dict:

class FileItem(dict):
    def __init__(self, fname):
        dict.__init__(self, fname=fname)

f = FileItem('tasks.txt')
json.dumps(f)  #No need to change anything here

This works if your class is just basic data representation, for trickier things you can always set keys explicitly in the call to dict.__init__().

This works because json.dumps() checks if the object is one of several known types via a rather unpythonic isinstance(value, dict) - so it would be possible to fudge this with __class__ and some other methods if you really don't want to inherit from dict.
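
One consequence worth seeing in code: because the encoder treats the object as a dict, only the dict's keys end up in the output, and ordinary instance attributes are ignored. A small sketch (the size attribute is added here purely for illustration):

```python
import json

class FileItem(dict):
    def __init__(self, fname):
        dict.__init__(self, fname=fname)
        self.size = 0  # plain attribute, not a dict key

f = FileItem("tasks.txt")
is_dict = isinstance(f, dict)
dumped = json.dumps(f)
print(is_dict)  # True
print(dumped)   # {"fname": "tasks.txt"}
```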

Upvotes: 239

xscorp7

Reputation: 311

We often dump complex dictionaries in JSON format in log files. While most of the fields carry important information, we don't care much about the built-in class objects (for example a subprocess.Popen object). Due to the presence of unserializable objects like these, the call to json.dumps() fails.

To get around this, I built a small function that dumps the object's string representation instead of dumping the object itself. And if the data structure you are dealing with is too deeply nested, you can specify the maximum nesting level/depth.

import json
import subprocess
from time import time

def safe_serialize(obj , max_depth = 2):

    max_level = max_depth

    def _safe_serialize(obj , current_level = 0):

        nonlocal max_level

        # If it is a list
        if isinstance(obj , list):

            if current_level >= max_level:
                return "[...]"

            result = list()
            for element in obj:
                result.append(_safe_serialize(element , current_level + 1))
            return result

        # If it is a dict
        elif isinstance(obj , dict):

            if current_level >= max_level:
                return "{...}"

            result = dict()
            for key , value in obj.items():
                result[f"{_safe_serialize(key , current_level + 1)}"] = _safe_serialize(value , current_level + 1)
            return result

        # If it is an object of builtin class
        elif hasattr(obj , "__dict__"):
            if hasattr(obj , "__repr__"):
                result = f"{obj.__repr__()}_{int(time())}"
            else:
                try:
                    result = f"{obj.__class__.__name__}_object_{int(time())}"
                except:
                    result = f"object_{int(time())}"
            return result

        # If it is anything else
        else:
            return obj

    return _safe_serialize(obj)

A dictionary can also have unserializable keys. Dumping only their class name or object representation could give several keys the same name, which is a problem because keys must be unique. That is why the current time since the epoch, int(time()), is appended to object names.

This function can be tested with the following nested dictionary with different levels/depths-

d = {
    "a" : {
        "a1" : {
            "a11" : {
                "a111" : "some_value" ,
                "a112" : "some_value" ,
            } ,
            "a12" : {
                "a121" : "some_value" ,
                "a122" : "some_value" ,
            } ,
        } ,
        "a2" : {
            "a21" : {
                "a211" : "some_value" ,
                "a212" : "some_value" ,
            } ,
            "a22" : {
                "a221" : "some_value" ,
                "a222" : "some_value" ,
            } ,
        } ,
    } ,
    "b" : {
        "b1" : {
            "b11" : {
                "b111" : "some_value" ,
                "b112" : "some_value" ,
            } ,
            "b12" : {
                "b121" : "some_value" ,
                "b122" : "some_value" ,
            } ,
        } ,
        "b2" : {
            "b21" : {
                "b211" : "some_value" ,
                "b212" : "some_value" ,
            } ,
            "b22" : {
                "b221" : "some_value" ,
                "b222" : "some_value" ,
            } ,
        } ,
    } ,
    "c" : subprocess.Popen("ls -l".split() , stdout = subprocess.PIPE , stderr = subprocess.PIPE) ,
}

Running the following will lead to-

print("LEVEL 3")
print(json.dumps(safe_serialize(d , 3) , indent = 4))

print("\n\n\nLEVEL 2")
print(json.dumps(safe_serialize(d , 2) , indent = 4))

print("\n\n\nLEVEL 1")
print(json.dumps(safe_serialize(d , 1) , indent = 4))

Result:

LEVEL 3
{
    "a": {
        "a1": {
            "a11": "{...}",
            "a12": "{...}"
        },
        "a2": {
            "a21": "{...}",
            "a22": "{...}"
        }
    },
    "b": {
        "b1": {
            "b11": "{...}",
            "b12": "{...}"
        },
        "b2": {
            "b21": "{...}",
            "b22": "{...}"
        }
    },
    "c": "<Popen: returncode: None args: ['ls', '-l']>"
}



LEVEL 2
{
    "a": {
        "a1": "{...}",
        "a2": "{...}"
    },
    "b": {
        "b1": "{...}",
        "b2": "{...}"
    },
    "c": "<Popen: returncode: None args: ['ls', '-l']>"
}



LEVEL 1
{
    "a": "{...}",
    "b": "{...}",
    "c": "<Popen: returncode: None args: ['ls', '-l']>"
}

[NOTE]: Only use this if you don't care about serialization of a built-in class object.

Upvotes: 0

Aman Goel

Reputation: 55

For anyone who wants basic conversion without an external library, you can simply override the __iter__ and __str__ methods of the custom class in the following way.

import json

class JSONCustomEncoder(json.JSONEncoder):
    def default(self, obj):
        return obj.__dict__


class Student:
    def __init__(self, name: str, age: int):
        self.name = name
        self.age = age

    def __iter__(self):
        yield from {
            "name": self.name,
            "age": self.age,
        }.items()

    def __str__(self):
        return json.dumps(
            self.__dict__, cls=JSONCustomEncoder, ensure_ascii=False
        )

Use the object by wrapping in a dict(), so that data remains preserved.

s = Student("aman", 24)
dict(s)

Upvotes: -1

NicoHood

Reputation: 1123

Why are you guys making it so complicated? Here is a simple example:

#!/usr/bin/env python3

import json
from dataclasses import dataclass

@dataclass
class Person:
    first: str
    last: str
    age: int

    @property
    def __json__(self):
        return {
            "name": f"{self.first} {self.last}",
            "age": self.age
        }

john = Person("John", "Doe", 42)
print(json.dumps(john, indent=4, default=lambda x: x.__json__))

This way you could also serialize nested classes, as __json__ returns a python object and not a string. No need to use a JSONEncoder, as the default parameter with a simple lambda also works fine.

I've used @property instead of a simple function, as this feels more natural and modern. The @dataclass is also just an example, it works for a "normal" class as well.
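
To illustrate the nested case, here is a sketch with a hypothetical Address class inside Person. Each __json__ returns plain Python data, and the default lambda is called again for the inner object.

```python
import json
from dataclasses import dataclass

@dataclass
class Address:
    city: str

    @property
    def __json__(self):
        return {"city": self.city}

@dataclass
class Person:
    name: str
    address: Address

    @property
    def __json__(self):
        # Return the nested object itself; default is called again for it.
        return {"name": self.name, "address": self.address}

text = json.dumps(Person("John", Address("Oslo")), default=lambda x: x.__json__)
print(text)  # {"name": "John", "address": {"city": "Oslo"}}
```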

Upvotes: 5

Daniel Flippance

Reputation: 7932

To throw another log on this 11 year old fire, I want a solution that meets the following criteria:

  • Allows an instance of class FileItem to be serialized using only json.dumps(obj)
  • Allows FileItem instances to have properties: fileItem.fname
  • Allows FileItem instances to be given to any library which will serialise it using json.dumps(obj)
  • Doesn't require any other fields to be passed to json.dumps (like a custom serializer)

IE:

fileItem = FileItem('filename.ext')
assert json.dumps(fileItem) == '{"fname": "filename.ext"}'
assert fileItem.fname == 'filename.ext'

My solution is:

  • Have obj's class inherit from dict
  • Map each object property to the underlying dict

class FileItem(dict):
    def __init__(self, fname):
        self['fname'] = fname

    #fname property
    fname: str = property()
    @fname.getter
    def fname(self):
        return self['fname']

    @fname.setter
    def fname(self, value: str):
        self['fname'] = value

    #Repeat for other properties

Yes, this is somewhat long winded if you have lots of properties, but it is JSONSerializable and it behaves like an object and you can give it to any library that's going to json.dumps(obj) it.

Upvotes: 6

Fancy John

Reputation: 39428

Just add to_json method to your class like this:

def to_json(self):
  return self.message # or how you want it to be serialized

And add this code (from this answer), to somewhere at the top of everything:

from json import JSONEncoder

def _default(self, obj):
    return getattr(obj.__class__, "to_json", _default.default)(obj)

_default.default = JSONEncoder().default
JSONEncoder.default = _default

This will monkey-patch json module when it's imported, so JSONEncoder.default() automatically checks for a special to_json() method and uses it to encode the object if found.

Just like Onur said, but this time you don't have to update every json.dumps() in your project.

Upvotes: 111

user1587520

Reputation: 4533

As mentioned in many other answers you can pass a function to json.dumps to convert objects that are not one of the types supported by default to a supported type. Surprisingly none of them mentions the simplest case, which is to use the built-in function vars to convert objects into a dict containing all their attributes:

json.dumps(obj, default=vars)

Note that this covers only basic cases. If you need more specific serialization for certain types (e.g. excluding certain attributes, or for objects that don't have a __dict__ attribute), you need to use a custom function or a JSONEncoder as described in the other answers.
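
A sketch of such a custom function, assuming you want to skip attributes whose names start with an underscore and also support classes that use __slots__ instead of __dict__. The helper name to_serializable and both classes are made up for this example.

```python
import json

class Slotted:
    __slots__ = ("a", "_secret")

    def __init__(self):
        self.a = 1
        self._secret = "x"

class Plain:
    def __init__(self):
        self.b = 2
        self._cache = "y"

def to_serializable(obj):
    # Hypothetical helper: collect attributes from __dict__ or __slots__,
    # skipping names that start with an underscore.
    # Only the slots declared on the object's own class are considered.
    if hasattr(obj, "__dict__"):
        items = vars(obj).items()
    elif hasattr(obj, "__slots__"):
        items = ((n, getattr(obj, n)) for n in obj.__slots__ if hasattr(obj, n))
    else:
        raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")
    return {k: v for k, v in items if not k.startswith("_")}

text = json.dumps([Plain(), Slotted()], default=to_serializable)
print(text)  # [{"b": 2}, {"a": 1}]
```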

Upvotes: 215

R H

Reputation: 2304

If you're using Python3.5+, you could use jsons. (PyPi: https://pypi.org/project/jsons/) It will convert your object (and all its attributes recursively) to a dict.

import jsons

a_dict = jsons.dump(your_object)

Or if you wanted a string:

a_str = jsons.dumps(your_object)

Or if your class implemented jsons.JsonSerializable:

a_dict = your_object.json

Upvotes: 49

tryer3000

Reputation: 829

import simplejson

class User(object):
    def __init__(self, name, mail):
        self.name = name
        self.mail = mail

    def _asdict(self):
        return self.__dict__

print(simplejson.dumps(User('alice', '[email protected]')))

If using the standard json module, you need to define a default function:

import json
def default(o):
    return o._asdict()

print(json.dumps(User('alice', '[email protected]'), default=default))

Upvotes: 15

tobigue

Reputation: 3617

I came across this problem the other day and implemented a more general version of an Encoder for Python objects that can handle nested objects and inherited fields:

import json
import inspect

class ObjectEncoder(json.JSONEncoder):
    def default(self, obj):
        if hasattr(obj, "to_json"):
            return self.default(obj.to_json())
        elif hasattr(obj, "__dict__"):
            d = dict(
                (key, value)
                for key, value in inspect.getmembers(obj)
                if not key.startswith("__")
                and not inspect.isabstract(value)
                and not inspect.isbuiltin(value)
                and not inspect.isfunction(value)
                and not inspect.isgenerator(value)
                and not inspect.isgeneratorfunction(value)
                and not inspect.ismethod(value)
                and not inspect.ismethoddescriptor(value)
                and not inspect.isroutine(value)
            )
            return self.default(d)
        return obj

Example:

class C(object):
    c = "NO"
    def to_json(self):
        return {"c": "YES"}

class B(object):
    b = "B"
    i = "I"
    def __init__(self, y):
        self.y = y
        
    def f(self):
        print("f")

class A(B):
    a = "A"
    def __init__(self):
        self.b = [{"ab": B("y")}]
        self.c = C()

print(json.dumps(A(), cls=ObjectEncoder, indent=2, sort_keys=True))

Result:

{
  "a": "A", 
  "b": [
    {
      "ab": {
        "b": "B", 
        "i": "I", 
        "y": "y"
      }
    }
  ], 
  "c": {
    "c": "YES"
  }, 
  "i": "I"
}

Upvotes: 36

Quinten C

Reputation: 771

This function uses recursion to iterate over every part of the dictionary and then calls the repr() methods of classes that are not built-in types.

def sterilize(obj):
    object_type = type(obj)
    if isinstance(obj, dict):
        return {k: sterilize(v) for k, v in obj.items()}
    elif object_type in (list, tuple):
        return [sterilize(v) for v in obj]
    elif object_type in (str, int, bool, float):
        return obj
    else:
        return obj.__repr__()

Upvotes: 0

Wolfgang Fahl

Reputation: 15594

Kyle Delaney's comment is correct, so I tried to use the answer https://stackoverflow.com/a/15538391/1497139 as well as an improved version of https://stackoverflow.com/a/10254820/1497139 to create a "JSONAble" mixin.

So to make a class JSON serializable, use "JSONAble" as a superclass and either call:

 instance.toJSON()

or

 instance.asJSON()

for the two offered methods. You could also extend the JSONAble class with other approaches offered here.

The test example for the Unit Test with Family and Person sample results in:

toJSON():

{
    "members": {
        "Flintstone,Fred": {
            "firstName": "Fred",
            "lastName": "Flintstone"
        },
        "Flintstone,Wilma": {
            "firstName": "Wilma",
            "lastName": "Flintstone"
        }
    },
    "name": "The Flintstones"
}

asJSON():

{'name': 'The Flintstones', 'members': {'Flintstone,Fred': {'firstName': 'Fred', 'lastName': 'Flintstone'}, 'Flintstone,Wilma': {'firstName': 'Wilma', 'lastName': 'Flintstone'}}}

Unit Test with Family and Person sample

def testJsonAble(self):
        family=Family("The Flintstones")
        family.add(Person("Fred","Flintstone")) 
        family.add(Person("Wilma","Flintstone"))
        json1=family.toJSON()
        json2=family.asJSON()
        print(json1)
        print(json2)

class Family(JSONAble):
    def __init__(self,name):
        self.name=name
        self.members={}
    
    def add(self,person):
        self.members[person.lastName+","+person.firstName]=person

class Person(JSONAble):
    def __init__(self,firstName,lastName):
        self.firstName=firstName
        self.lastName=lastName

jsonable.py defining JSONAble mixin

'''
Created on 2020-09-03

@author: wf
'''
import json

class JSONAble(object):
    '''
    mixin to allow classes to be JSON serializable see
    https://stackoverflow.com/questions/3768895/how-to-make-a-class-json-serializable
    '''

    def __init__(self):
        '''
        Constructor
        '''
    
    def toJSON(self):
        return json.dumps(self, default=lambda o: o.__dict__, 
            sort_keys=True, indent=4)
        
    def getValue(self,v):
        if (hasattr(v, "asJSON")):
            return v.asJSON()
        elif type(v) is dict:
            return self.reprDict(v)
        elif type(v) is list:
            vlist=[]
            for vitem in v:
                vlist.append(self.getValue(vitem))
            return vlist
        else:   
            return v
    
    def reprDict(self,srcDict):
        '''
        get my dict elements
        '''
        d = dict()
        for a, v in srcDict.items():
            d[a]=self.getValue(v)
        return d
    
    def asJSON(self):
        '''
        recursively return my dict elements
        '''
        return self.reprDict(self.__dict__)   

You'll find these approaches now integrated in the https://github.com/WolfgangFahl/pyLoDStorage project which is available at https://pypi.org/project/pylodstorage/

Upvotes: 3

Tobi
Tobi

Reputation: 531

This is a small library that serializes an object with all its children to JSON and also parses it back:

https://github.com/tobiasholler/PyJSONSerialization/

Upvotes: 0

mheyman
mheyman

Reputation: 4323

Building on Quinten Cabo's answer:

def sterilize(obj):
    """Make an object more amenable to dumping as JSON."""
    if type(obj) in (str, float, int, bool, type(None)):
        return obj
    elif isinstance(obj, dict):
        return {k: sterilize(v) for k, v in obj.items()}
    list_ret = []
    dict_ret = {}
    for a in dir(obj):
        if a == '__iter__' and callable(obj.__iter__):
            list_ret.extend([sterilize(v) for v in obj])
        elif a == '__dict__':
            dict_ret.update({k: sterilize(v) for k, v in obj.__dict__.items() if k not in ['__module__', '__dict__', '__weakref__', '__doc__']})
        elif a not in ['__doc__', '__module__']:
            aval = getattr(obj, a)
            if type(aval) in (str, float, int, bool, type(None)):
                dict_ret[a] = aval
            elif a != '__class__' and a != '__objclass__' and isinstance(aval, type):
                dict_ret[a] = sterilize(aval)
    if len(list_ret) == 0:
        if len(dict_ret) == 0:
            return repr(obj)
        return dict_ret
    else:
        if len(dict_ret) == 0:
            return list_ret
    return (list_ret, dict_ret)

The differences are

  1. Works for any iterable instead of just list and tuple (it works for NumPy arrays, etc.)
  2. Works for dynamic types (ones that contain a __dict__).
  3. Includes native types float and None so they don't get converted to string.
  4. Classes that have __dict__ and members will mostly work (if the __dict__ and member names collide, you will only get one - likely the member)
  5. Classes that are lists and have members will look like a tuple of the list and a dictionary
  6. Python3 (that isinstance() call may be the only thing that needs changing)

Upvotes: 2

Sheikh Abdul Wahid
Sheikh Abdul Wahid

Reputation: 2773

import json

class DObject(json.JSONEncoder):
    def delete_not_related_keys(self, _dict):
        # Drop the settings that JSONEncoder.__init__ stores on every instance.
        for key in ["skipkeys", "ensure_ascii", "check_circular", "allow_nan", "sort_keys", "indent"]:
            _dict.pop(key, None)

    def default(self, o):
        if hasattr(o, '__dict__'):
            my_dict = o.__dict__.copy()
            self.delete_not_related_keys(my_dict)
            return my_dict
        # Let the base class raise TypeError for unsupported objects.
        return super().default(o)

a = DObject()
a.name = 'abdul wahid'
b = DObject()
b.name = a

print(json.dumps(b, cls=DObject))

Upvotes: 2

Manoj Govindan
Manoj Govindan

Reputation: 74775

Do you have an idea about the expected output? For example, will this do?

>>> f  = FileItem("/foo/bar")
>>> magic(f)
'{"fname": "/foo/bar"}'

In that case you can merely call json.dumps(f.__dict__).
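
As a rough sketch of that shortcut, using the FileItem class from the question (vars(f) returns the same dictionary as f.__dict__). The Folder class is a hypothetical addition, included only to show where the shortcut stops working:

```python
import json


class FileItem:
    def __init__(self, fname):
        self.fname = fname


class Folder:
    # Hypothetical class holding another custom object.
    def __init__(self, item):
        self.item = item


f = FileItem("/foo/bar")
flat = json.dumps(vars(f))
print(flat)  # {"fname": "/foo/bar"}

# The shortcut only covers one level: the nested FileItem is still not serializable.
try:
    json.dumps(vars(Folder(f)))
    nested_ok = True
except TypeError:
    nested_ok = False
print(nested_ok)  # False
```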

If you want more customized output then you will have to subclass JSONEncoder and implement your own custom serialization.

For a trivial example, see below.

>>> from json import JSONEncoder
>>> class MyEncoder(JSONEncoder):
        def default(self, o):
            return o.__dict__    

>>> MyEncoder().encode(f)
'{"fname": "/foo/bar"}'

Then you pass this class into the json.dumps() method as the cls kwarg:

json.dumps(f, cls=MyEncoder)

If you also want to decode, then you'll have to supply a custom object_hook to the JSONDecoder class. For example:

>>> from json import JSONDecoder
>>> def from_json(json_object):
        if 'fname' in json_object:
            return FileItem(json_object['fname'])
        return json_object
>>> f = JSONDecoder(object_hook = from_json).decode('{"fname": "/foo/bar"}')
>>> f
<__main__.FileItem object at 0x9337fac>
>>> 

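A full round trip could look like the sketch below. It is one possible arrangement, not the only one: the "__type__" marker key and the to_json helper are assumptions made up for this example, so the hook can tell FileItem dictionaries apart from ordinary ones.

```python
import json


class FileItem:
    def __init__(self, fname):
        self.fname = fname


def to_json(o):
    # Called by json.dumps only for objects it cannot serialize itself.
    if isinstance(o, FileItem):
        return {"__type__": "FileItem", "fname": o.fname}
    raise TypeError(f"Object of type {type(o).__name__} is not JSON serializable")


def from_json(d):
    # Called for every decoded JSON object; other dicts are returned unchanged.
    if d.get("__type__") == "FileItem":
        return FileItem(d["fname"])
    return d


text = json.dumps({"file": FileItem("/foo/bar"), "size": 3}, default=to_json)
restored = json.loads(text, object_hook=from_json)
print(type(restored["file"]).__name__, restored["file"].fname, restored["size"])
```

Tagging the encoded dictionary keeps the decoder from mistaking unrelated objects that merely happen to share a key name.
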
Upvotes: 730

Adi Degani
Adi Degani

Reputation: 347

First we need to make our object JSON-compliant, so we can dump it using the standard JSON module. I did it this way:

def serialize(o):
    if isinstance(o, dict):
        return {k:serialize(v) for k,v in o.items()}
    if isinstance(o, list):
        return [serialize(e) for e in o]
    if isinstance(o, bytes):
        return o.decode("utf-8")
    return o
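
A short usage sketch, assuming the serialize function above (repeated so the snippet runs on its own). The payload is made-up sample data:

```python
import json


def serialize(o):
    # Same function as above, repeated so this snippet is self-contained.
    if isinstance(o, dict):
        return {k: serialize(v) for k, v in o.items()}
    if isinstance(o, list):
        return [serialize(e) for e in o]
    if isinstance(o, bytes):
        return o.decode("utf-8")
    return o


# Made-up sample data with bytes values nested inside a dict and a list.
payload = {"name": b"caf\xc3\xa9", "tags": [b"a", "b"], "count": 2}
text = json.dumps(serialize(payload), ensure_ascii=False)
print(text)
```

Anything that is not a dict, list, or bytes value is passed through unchanged, so instances of custom classes would still raise TypeError in json.dumps.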

Upvotes: 0

Tangibleidea
Tangibleidea

Reputation: 474

In addition to Onur's answer, you may want to handle the datetime type as shown below
(to avoid the "'datetime.datetime' object has no attribute '__dict__'" exception).

import datetime

def datetime_option(value):
    # timestamp() exists on datetime.datetime, not on plain datetime.date
    if isinstance(value, datetime.datetime):
        return value.timestamp()
    else:
        return value.__dict__

Usage:

def toJSON(self):
    return json.dumps(self, default=datetime_option, sort_keys=True, indent=4)
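
Put together, it might look like the sketch below. The Event class is hypothetical, and isoformat() is used instead of timestamp() as an assumption for this example, because it is also available on plain datetime.date values:

```python
import datetime
import json


def datetime_option(value):
    # datetime.datetime is a subclass of datetime.date, so this covers both.
    if isinstance(value, datetime.date):
        return value.isoformat()
    return value.__dict__


class Event:
    # Hypothetical class, used only for illustration.
    def __init__(self, name, when):
        self.name = name
        self.when = when

    def toJSON(self):
        return json.dumps(self, default=datetime_option, sort_keys=True, indent=4)


event = Event("launch", datetime.datetime(2020, 9, 3, 12, 30))
text = event.toJSON()
print(text)
```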

Upvotes: 0

Related Questions