Reputation: 3291
The question title captures what I want to do. But to make it concrete, let's assume I have this schema (which is a schema definition in terms of python-jsonschema: https://python-jsonschema.readthedocs.io/en/stable/):
schema = {
    "type": "object",
    "properties": {
        "price": {"type": "number"},
        "name": {"type": "string"},
    },
}
With this being a valid document:
{"name" : "Eggs", "price" : 34.99}
and these two classes:
class Price:
    def __init__(self, price: float):
        self.price = price

class Object:
    def __init__(self, name: str, price: float):
        self.name = name
        self.price = Price(price)
Below I list common solutions and my reservations about them. My question is thus whether there is another method I don't know of, or whether my reservations about any of the solutions below are misplaced. Or, simply put, what is best practice?
Method 1: Use objects' __dict__ representation and then serialise with json.dumps() (see this answer).
Reservation: This couples the object to the schema. For example, I now have to name object properties with the property names required by the schema. The reasons why this is bad are obvious: backward compatibility problems, conflicts with coding style guide, design lock-in etc. There is also a comment on the linked answer with 47 upvotes arguing that "Using __dict__ will not work in all cases. If the attributes have not been set after the object was instantiated, __dict__ may not be fully populated." I confess I don't understand. But it sounds bad.
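One possible reading of that comment, as a minimal sketch: attributes defined on the class rather than assigned on the instance never show up in the instance's __dict__. The `currency` attribute below is hypothetical, added only to illustrate the point. The same sketch also shows the coupling problem, since the nested Price object leaks its own attribute name into the output.

```python
import json


class Price:
    def __init__(self, price: float):
        self.price = price


class Object:
    currency = "USD"  # hypothetical class-level default, for illustration only

    def __init__(self, name: str, price: float):
        self.name = name
        self.price = Price(price)


egg = Object("Eggs", 34.99)

# The class attribute is readable on the instance but absent from __dict__.
print(egg.currency)          # USD
print(sorted(egg.__dict__))  # ['name', 'price']

# Serialising via __dict__ mirrors the class nesting, not the schema.
print(json.dumps(egg, default=lambda o: o.__dict__))
# {"name": "Eggs", "price": {"price": 34.99}}
```

The result has "price" as an object, so it would not satisfy the schema above, which requires a number there.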
Method 2: Subclass JSONEncoder (also in this answer)
Reservation: This seems helpful organisationally but it still begs the question of how to implement the default method without calling __dict__ anyway and having the same problem as above.
Method 3: write a custom asdict on each class. This is what I'm currently doing. It looks something like this:
class Price:
    def __init__(self, price: float):
        self.price = price

    def asdict(self):
        # if we had:
        #     return {"price": self.price}
        # Then the nesting mismatch between the class hierarchy and the schema would cause a problem.
        return self.price

class Object:
    def __init__(self, name: str, price: float):
        self.name = name
        self.price = Price(price)

    def asdict(self):
        return {"name": self.name, "price": self.price.asdict()}
Reservations: most clearly, there is now the problem that my class hierarchy becomes coupled to the nesting structure of the schema. You can see above the problem this has caused. More seriously though, it means my serialisation definition is spread over multiple asdict() methods in multiple different classes. What I want is to have a file called "serializers.py" that completely specifies the process of converting my class hierarchy to JSON, not disparate asdict() methods all over my code.
Any advice?
Upvotes: 1
Views: 1087
Reputation: 27321
You'll need to subclass JSONEncoder for any non-trivial tasks, but this subclass can also look for conventions in the objects it serializes (like your asdict method), and fall back to the __dict__.
I've written a module named dkjason which you can look at for inspiration (it's also on PyPI). Here is the meat of it (we call our asdict method __json__()):
class DkJSONEncoder(json.JSONEncoder):
    """Handle special cases, like Decimal...
    """
    def default(self, obj):  # pylint:disable=R0911
        if isinstance(obj, decimal.Decimal):
            return float(obj)
        if hasattr(obj, '__json__'):
            return obj.__json__()
        if isinstance(obj, set):
            return list(obj)
        if isinstance(obj, ttcal.Year):
            return dict(year=obj.year, kind='YEAR')
        if isinstance(obj, ttcal.Duration):
            return '@duration:%d' % obj.toint()
        if isinstance(obj, datetime.datetime):
            return '@datetime:%s' % obj.isoformat()
        if isinstance(obj, datetime.date):
            return '@date:%s' % obj.isoformat()
        if isinstance(obj, datetime.time):
            return dict(hour=obj.hour,
                        minute=obj.minute,
                        second=obj.second,
                        microsecond=obj.microsecond,
                        kind="TIME")
        if isinstance(obj, QuerySet):
            return list(obj)
        if hasattr(obj, '__dict__'):
            return dict((k, v) for k, v in obj.__dict__.items()
                        if not k.startswith('_'))
        return super(DkJSONEncoder, self).default(obj)
and a convenience method to call it:
def dumps(val, indent=4, sort_keys=True, cls=DkJSONEncoder):
    """Dump json value, using our special encoder class.
    """
    return json.dumps(val, indent=indent, sort_keys=sort_keys, cls=cls)
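To show how the convention plays out with the classes from the question, here is a minimal sketch. `ConventionEncoder` is a hypothetical, reduced stand-in for the encoder above that keeps only the __json__ hook and the __dict__ fallback, so it runs with the standard library alone.

```python
import json


class ConventionEncoder(json.JSONEncoder):
    """Hypothetical reduced encoder: __json__ hook, then __dict__ fallback."""

    def default(self, obj):
        if hasattr(obj, "__json__"):
            return obj.__json__()
        if hasattr(obj, "__dict__"):
            return {k: v for k, v in obj.__dict__.items()
                    if not k.startswith("_")}
        return super().default(obj)


class Price:
    def __init__(self, price: float):
        self.price = price

    def __json__(self):
        # Flatten to a bare number, which is what the schema expects.
        return self.price


class Object:
    # No __json__ here: the encoder falls back to the public attributes.
    def __init__(self, name: str, price: float):
        self.name = name
        self.price = Price(price)


print(json.dumps(Object("Eggs", 34.99), cls=ConventionEncoder, sort_keys=True))
# {"name": "Eggs", "price": 34.99}
```

Only Price needs a hook here, because it is the one class whose shape differs from the schema.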
I would advise against having all serializers in a single file, since that prevents anyone who can't change your source code from extending your scheme. As you can see above, you can centralize as much or as little as you want in the JSONEncoder subclass.
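If you do want most of the mapping in one place while leaving it open to extension, one option is a registry. The sketch below uses functools.singledispatch from the standard library; this is a swapped-in technique, not what dkjason does, and the names `to_jsonable` and `RegistryEncoder` are hypothetical.

```python
import json
from functools import singledispatch


class Price:
    def __init__(self, price: float):
        self.price = price


class Object:
    def __init__(self, name: str, price: float):
        self.name = name
        self.price = Price(price)


@singledispatch
def to_jsonable(obj):
    """Fallback for types that nobody has registered."""
    raise TypeError(f"Cannot serialize {type(obj).__name__}")


@to_jsonable.register(Price)
def _(obj):
    return obj.price


@to_jsonable.register(Object)
def _(obj):
    # Keys follow the schema, not the attribute names.
    return {"name": obj.name, "price": obj.price}


class RegistryEncoder(json.JSONEncoder):
    def default(self, obj):
        # json only calls default() for objects it cannot encode itself.
        return to_jsonable(obj)


print(json.dumps(Object("Eggs", 34.99), cls=RegistryEncoder))
# {"name": "Eggs", "price": 34.99}
```

Code that cannot edit this module can still call to_jsonable.register for its own types, so the scheme stays extensible.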
Upvotes: 2