I have the following class:
class Thing:
    def __init__(self, x: str):
        self.x = x

    def __str__(self):
        return self.x

    @classmethod
    def __get_validators__(cls):
        yield cls.validate

    @classmethod
    def validate(cls, v: str) -> "Thing":
        return cls(v)
Due to the validator method, I can use this class as a custom field type in a Pydantic model:
from pydantic import BaseModel
from thing import Thing
class Model(BaseModel):
    thing: Thing
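For illustration (hypothetical usage, Pydantic v1): validation runs validate on the raw input, so a plain string is parsed into a Thing:

m = Model(thing="hello")
print(type(m.thing))  # <class 'thing.Thing'>
print(m.thing)        # hello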
But if I want to serialize to JSON, I need to set the json_encoders option on the Pydantic model:
class Model(BaseModel):
    class Config:
        json_encoders = {
            Thing: str
        }

    thing: Thing
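With that config in place, serialization and parsing both work (hypothetical values, Pydantic v1):

m = Model(thing=Thing("foo"))
print(m.json())  # {"thing": "foo"}
restored = Model.parse_raw('{"thing": "bar"}')
print(restored.thing)  # bar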
Now Pydantic can serialize Thing instances to JSON and back. But the config is in two places: partly on the Model and partly on the class Thing. I'd like to set it all on Thing.
Is there any way to set the json_encoders option on Thing so Pydantic knows how to handle it transparently? Note that Thing is minimized here: it has a lot of logic, and I'm not just trying to declare a custom str type.
This is actually an issue that goes much deeper than Pydantic models in my opinion. I found this ongoing discussion about whether a standard protocol with a method like __json__ or __serialize__ should be introduced in Python.
The problem is that Pydantic is confined by the same limitations as the standard library's json module, in that encoding/serialization logic for custom types is separated from the class itself.
Whether or not the broader idea of introducing such a protocol makes sense, we can piggy-back off of it a little and define a customized version of json.dumps that checks for the presence of e.g. a __serialize__ method and uses it as the default function to serialize the object. (See the json.dump documentation for an explanation of the default parameter.)
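As a quick illustration of that parameter (Point is a hypothetical type): json.dumps raises a TypeError for unknown types unless a default fallback is supplied:

import json

class Point:
    def __init__(self, x: int, y: int) -> None:
        self.x, self.y = x, y

try:
    json.dumps(Point(1, 2))
except TypeError as e:
    print(e)  # Object of type Point is not JSON serializable

# `default` is called for any object the encoder cannot handle natively:
print(json.dumps(Point(1, 2), default=lambda p: [p.x, p.y]))  # [1, 2]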
Then we can set up a custom base model with the Config.json_dumps option set to that function. That way, all child models automatically fall back to it for serialization (unless overridden, e.g. by the encoder argument to the BaseModel.json method).
Here is an example:
base.py
from collections.abc import Callable
from json import dumps as json_dumps
from typing import Any

from pydantic import BaseModel as PydanticBaseModel


def json_dumps_extended(obj: object, **kwargs: Any) -> str:
    default: Callable[[object], object] = kwargs.pop("default", lambda x: x)

    def custom_default(to_encode: object) -> object:
        serialize_method = getattr(to_encode, "__serialize__", None)
        if serialize_method is None:
            return default(to_encode)
        return serialize_method()  # <-- already bound to `to_encode`

    return json_dumps(obj, default=custom_default, **kwargs)


class BaseModel(PydanticBaseModel):
    class Config:
        json_dumps = json_dumps_extended
application.py
from __future__ import annotations

from collections.abc import Callable, Iterator

from .base import BaseModel


class Thing:
    def __init__(self, x: str) -> None:
        self.x = x

    def __str__(self) -> str:
        return self.x

    def __serialize__(self) -> str:  # <-- this is the magic method
        return self.x

    @classmethod
    def __get_validators__(cls) -> Iterator[Callable[..., Thing]]:
        yield cls.validate

    @classmethod
    def validate(cls, v: str) -> Thing:
        return cls(v)


class Model(BaseModel):
    thing: Thing
    num: float = 3.14


instance = Model(thing=Thing("foo"))
print(instance.json(indent=4))
Output:
{
    "thing": "foo",
    "num": 3.14
}
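Since json_dumps_extended is plain json.dumps under the hood, it also works outside of Pydantic (hypothetical usage, assuming base and application are importable as top-level modules):

from application import Thing
from base import json_dumps_extended

print(json_dumps_extended({"things": [Thing("a"), Thing("b")]}))
# {"things": ["a", "b"]}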
Note for Python <3.9 users: import the Callable and Iterator types from typing instead of collections.abc.
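That is, on e.g. Python 3.8:

from typing import Callable, Iterator  # instead of: from collections.abc import Callable, Iterator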
If you want to be able to re-use this approach to serialization in more places than just the base model, it may be a good idea to put a bit more effort into the types. A runtime_checkable custom protocol for our __serialize__ method may be useful.
Also, we can make the json_dumps_extended function a bit less clunky by using functools.partial.
Here is a slightly more sophisticated version of the suggested base.py:
from collections.abc import Callable
from functools import partial
from json import dumps as json_dumps
from typing import Any, Optional, Protocol, TypeVar, overload, runtime_checkable

from pydantic import BaseModel as PydanticBaseModel

T = TypeVar("T")
T_co = TypeVar("T_co", covariant=True)
Func1Arg = Callable[[object], T]


@runtime_checkable
class Serializable(Protocol[T_co]):
    def __serialize__(self) -> T_co: ...


@overload
def serialize(obj: Serializable[T_co]) -> T_co: ...
@overload
def serialize(obj: Any, fallback: Func1Arg[T]) -> T: ...
def serialize(obj: Any, fallback: Optional[Func1Arg[Any]] = None) -> Any:
    if isinstance(obj, Serializable):
        return obj.__serialize__()
    if fallback is None:
        raise TypeError(f"Object not serializable: {obj}")
    return fallback(obj)


def _id(x: T) -> T:
    return x


def json_dumps_extended(obj: object, **kwargs: Any) -> str:
    custom_default = partial(serialize, fallback=kwargs.pop("default", _id))
    return json_dumps(obj, default=custom_default, **kwargs)


class BaseModel(PydanticBaseModel):
    class Config:
        json_dumps = json_dumps_extended
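A quick sketch of how these pieces behave on their own (hypothetical usage; Thing is the class from application.py above, with the modules importable as top-level modules):

from application import Thing
from base import Serializable, serialize

thing = Thing("foo")
print(isinstance(thing, Serializable))  # True -- runtime protocol check
print(serialize(thing))                 # foo
print(serialize(42, fallback=str))      # 42 -- no __serialize__, so the fallback is used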
Another alternative might have been to monkey-patch JSONEncoder.default directly. But without further configuration, Pydantic seems to still perform its own type checks and prevent serialization before that method is even called.
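For completeness, a rough sketch of what such a patch could look like (a global change, not recommended):

import json

_original_default = json.JSONEncoder.default

def _patched_default(self: json.JSONEncoder, o: object) -> object:
    serialize_method = getattr(o, "__serialize__", None)
    if serialize_method is not None:
        return serialize_method()
    return _original_default(self, o)

json.JSONEncoder.default = _patched_default  # affects every user of JSONEncoder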
I don't think we have a better option until some standard serialization protocol (at least for JSON) is introduced.