Reputation: 71
I'd like to dynamically create a Pydantic model from a dataclass, similar to how you can dynamically create a Marshmallow schema from a dataclass as in marshmallow-dataclass or https://stevenloria.com/dynamic-schemas-in-marshmallow/. Is there already a library or easy way to do this?
Some background: I prefer using a dataclass in my business logic rather than using the Pydantic model directly. I use the Pydantic model only for serializing/deserializing data with camel-cased fields within my FastAPI app. However, I find myself basically duplicating the dataclass definition, which isn't efficient.
Sample Input:
from typing import List
from dataclasses import dataclass

@dataclass
class Item:
    id: int = None
    stuff: str = None
    height: float = None

@dataclass
class Bag:
    id: int = None
    name: str = None
    things: List[Item] = None

@dataclass
class Basket:
    id: int = None
    recipient: str = None
    bags: List[Bag] = None
    best_item: Item = None
Desired output:
from typing import List
from pydantic.main import BaseModel

def camel_case_converter(value: str):
    parts = value.lower().split('_')
    return parts[0] + ''.join(i.title() for i in parts[1:])

class CamelBaseModel(BaseModel):
    class Config:
        alias_generator = camel_case_converter

class Item(CamelBaseModel):
    id: int = None
    stuff: str = None
    height: float = None

class Bag(CamelBaseModel):
    id: int = None
    name: str = None
    things: List[Item] = None

class Basket(CamelBaseModel):
    id: int = None
    recipient: str = None
    bags: List[Bag] = None
    best_item: Item = None
Upvotes: 6
Views: 10641
Reputation: 86
To dynamically create a Pydantic model from a Python dataclass, you can use this simple approach: subclass both BaseModel and the dataclass. I can't guarantee it will work well for all use cases, but it works for mine: I need to generate a JSON schema from my dataclass, specifically with BaseModel's model_json_schema() method, for guided JSON with OpenAI, while still keeping all my data types in @dataclass form.
from typing import Type
from pydantic import BaseModel

def dataclass_to_pydantic_model(kls: type) -> Type[BaseModel]:
    """
    Converts a standard dataclass to a Pydantic BaseModel.

    Args:
        kls (type): The dataclass (the class itself, not an instance) to convert.

    Returns:
        Type[BaseModel]: A Pydantic model class based on the dataclass.
    """
    # Dynamically create the Pydantic model by inheriting from both classes
    class BaseModelDataclass(BaseModel, kls):
        pass

    return BaseModelDataclass
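For the JSON-schema use case specifically, another option that avoids multiple inheritance is Pydantic's `TypeAdapter`, which accepts a stdlib dataclass directly. This is a different technique from the subclassing trick above and is only a sketch: it assumes Pydantic v2, and the `Item` dataclass is borrowed from the question for illustration.

```python
from dataclasses import dataclass
from typing import Optional

from pydantic import TypeAdapter


@dataclass
class Item:
    id: Optional[int] = None
    stuff: Optional[str] = None
    height: Optional[float] = None


# TypeAdapter builds a validator/serializer for an arbitrary type,
# including a plain stdlib dataclass, without defining a BaseModel.
item_adapter = TypeAdapter(Item)

# JSON schema generated from the dataclass annotations.
schema = item_adapter.json_schema()
print(sorted(schema["properties"]))

# Validation returns an instance of the original dataclass.
item = item_adapter.validate_python({"id": "1", "stuff": "example", "height": 5.9})
print(item)
```

The dataclass stays the single source of truth here; the adapter is only used at the boundary where a schema or validation is needed.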
Upvotes: 1
Reputation: 8157
I have a partial solution / workaround here. The problem is deceptively complicated, and a really complete solution may take some work.
This works as expected in the simple case with no nested dataclass models. However, if you have nested dataclass models, in this implementation you will need to pass them into the Pydantic model as dataclasses (or dump them as dicts and let Pydantic do the conversion), which is a little unwieldy.
I have left a note in the code on how to improve this, but the nitty-gritty will take some work. Hopefully this is a good starting point and lets you do what you need to do. If I have a spare moment I may try to improve the code so that it converts the nested model types.
As a side note, in my own project I am OK with just writing the classes twice: once as dataclasses and once as Pydantic schema models. The reason is that the business logic should not be tightly coupled to what the schema returns, and usually you will find that your internal model may start out the same but will likely diverge at some point anyway. For example, you will want fields on your dataclass that are for backend use only, derived fields on your schema, and so on.
from typing import Type, Any, Dict
from pydantic import BaseModel, create_model
from dataclasses import fields, MISSING, dataclass

def camel_case_converter(value: str) -> str:
    parts = value.lower().split('_')
    return parts[0] + ''.join(i.title() for i in parts[1:])

class CamelBaseModel(BaseModel):
    class Config:
        alias_generator = camel_case_converter
        populate_by_name = True

def model_from_dataclass(kls: 'StdlibDataclass') -> Type[BaseModel]:
    """Converts a stdlib dataclass to a pydantic BaseModel"""
    field_definitions: Dict[str, Any] = {}
    for field in fields(kls):
        field_type = field.type
        # Adding recursive functionality for nested dataclasses could get a bit tricky:
        # for example, field type `list[Item]` ideally needs to be evaluated and converted
        # to `list[ItemSchema]`. There are workarounds though, see below.
        default_value = field.default if field.default is not MISSING else ...
        field_definitions[field.name] = (field_type, default_value)
    model = create_model(
        kls.__name__,
        __base__=CamelBaseModel,
        **field_definitions
    )
    return model
@dataclass
class Item:
    id: int | None = None
    stuff: str | None = None
    height: float | None = None

@dataclass
class Bag:
    id: int | None = None
    name: str | None = None
    things: list[Item] | None = None

@dataclass
class Basket:
    id: int | None = None
    recipient: str | None = None
    bags: list[Bag] | None = None
    best_item: Item | None = None

ItemSchema = model_from_dataclass(Item)
BagSchema = model_from_dataclass(Bag)
BasketSchema = model_from_dataclass(Basket)
# Testing the model
item = ItemSchema(id=1, stuff="example", height=5.9)

# will not work because `item` is not a dataclass
# bag = BagSchema(id=1, name="example", things=[item])

# this will work
item_dataclass = Item(id=1, stuff="example", height=5.9)
bag = BagSchema.model_validate({
    'id': 1, 'name': "example", 'things': [item_dataclass]
})

# this will also work
bag = BagSchema.model_validate({
    'id': 1,
    'name': 'example',
    'things': [item.model_dump()]
})
print(bag)
# output: id=1 name='example' things=[Item(id=1, stuff='example', height=5.9)]
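The recursive step mentioned in the code comment could be sketched roughly as follows. This is an illustration and not a complete solution: it assumes Pydantic v2 and Python 3.9+, it does not handle self-referential dataclasses, and the `_convert_type` helper, the `cache` parameter and the `Schema` name suffix are hypothetical names introduced here.

```python
import dataclasses
import types
from typing import Any, Dict, List, Optional, Type, Union, get_args, get_origin, get_type_hints

from pydantic import BaseModel, ConfigDict, Field, create_model


def camel_case_converter(value: str) -> str:
    parts = value.lower().split("_")
    return parts[0] + "".join(p.title() for p in parts[1:])


class CamelBaseModel(BaseModel):
    model_config = ConfigDict(alias_generator=camel_case_converter, populate_by_name=True)


# `X | Y` unions have origin types.UnionType on Python 3.10+.
_UNION_ORIGINS = (Union, getattr(types, "UnionType", Union))


def _convert_type(tp: Any, cache: Dict[type, Type[BaseModel]]) -> Any:
    """Swap dataclasses for generated models anywhere inside an annotation."""
    if isinstance(tp, type) and dataclasses.is_dataclass(tp):
        return model_from_dataclass(tp, cache)
    origin, args = get_origin(tp), get_args(tp)
    if origin is None or not args:
        return tp  # plain type such as int or str
    new_args = tuple(_convert_type(arg, cache) for arg in args)
    if origin in _UNION_ORIGINS:
        return Union[new_args]
    return origin[new_args]  # e.g. list[ItemSchema]


def model_from_dataclass(
    kls: type, cache: Optional[Dict[type, Type[BaseModel]]] = None
) -> Type[BaseModel]:
    """Build a model for `kls`, converting nested dataclasses as well."""
    cache = {} if cache is None else cache
    if kls in cache:
        return cache[kls]
    hints = get_type_hints(kls, include_extras=True)  # resolves string annotations
    definitions: Dict[str, Any] = {}
    for f in dataclasses.fields(kls):
        if f.default is not dataclasses.MISSING:
            default: Any = f.default
        elif f.default_factory is not dataclasses.MISSING:
            default = Field(default_factory=f.default_factory)
        else:
            default = ...  # required field
        definitions[f.name] = (_convert_type(hints[f.name], cache), default)
    model = create_model(kls.__name__ + "Schema", __base__=CamelBaseModel, **definitions)
    cache[kls] = model
    return model


@dataclasses.dataclass
class Item:
    id: Optional[int] = None
    stuff: Optional[str] = None
    height: Optional[float] = None


@dataclasses.dataclass
class Bag:
    id: Optional[int] = None
    name: Optional[str] = None
    things: Optional[List[Item]] = None


@dataclasses.dataclass
class Basket:
    id: Optional[int] = None
    recipient: Optional[str] = None
    bags: Optional[List[Bag]] = None
    best_item: Optional[Item] = None


BasketSchema = model_from_dataclass(Basket)
basket = BasketSchema.model_validate(
    {"id": 1, "bags": [{"id": 2, "things": [{"id": 3, "height": 1.5}]}], "bestItem": {"id": 3}}
)
dumped = basket.model_dump(by_alias=True)
print(dumped)
```

With this variant the nested values are model instances rather than dataclasses, so camel-cased aliases apply at every level when dumping with `by_alias=True`.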
Upvotes: 0
Reputation: 457
Maybe something like this? (from https://github.com/samuelcolvin/pydantic/issues/1967#issuecomment-742698281)
from dataclasses import dataclass
from typing import Type

from pydantic import BaseModel
from pydantic.dataclasses import dataclass as pydantic_dataclass

def model_from_dataclass(kls: 'StdlibDataclass') -> Type[BaseModel]:
    """Converts a stdlib dataclass to a pydantic BaseModel"""
    # Pydantic v1 only: the __pydantic_model__ attribute was removed in Pydantic v2
    return pydantic_dataclass(kls).__pydantic_model__

@dataclass
class Item:
    id: int = None
    stuff: str = None
    height: float = None

ItemBaseModel = model_from_dataclass(Item)
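On Pydantic v2, where pydantic dataclasses no longer expose `__pydantic_model__`, the wrapping half of this idea still applies: `pydantic.dataclasses.dataclass` can wrap an existing stdlib dataclass and add validation. The sketch below assumes Pydantic v2; it yields a validating dataclass rather than a `BaseModel`, and the name `ValidatedItem` is made up for the example.

```python
from dataclasses import dataclass
from typing import Optional

from pydantic import ValidationError
from pydantic.dataclasses import dataclass as pydantic_dataclass


@dataclass
class Item:
    id: Optional[int] = None
    stuff: Optional[str] = None
    height: Optional[float] = None


# Wrapping a stdlib dataclass yields a dataclass that validates on construction.
ValidatedItem = pydantic_dataclass(Item)

# Compatible input is coerced to the annotated types.
item = ValidatedItem(id="1", stuff="example", height="5.9")
print(item.id, item.height)

# Incompatible input raises pydantic's ValidationError.
try:
    ValidatedItem(id="not-a-number")
    rejected = False
except ValidationError:
    rejected = True
print(rejected)
```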
Upvotes: 4