Reputation: 1777
I have a working model to receive a JSON data set using pydantic. The model data set looks like this:
data = {'thing_number': 123,
        'thing_description': 'duck',
        'thing_amount': 4.56}
What I would like to do is have a list of JSON entries as the data set and be able to validate them. Ultimately the list will be converted to records in pandas for further processing. My goal is to validate an arbitrarily long list of JSON entries that looks something like this:
bigger_data = [{'thing_number': 123,
                'thing_description': 'duck',
                'thing_amount': 4.56},
               {'thing_number': 456,
                'thing_description': 'cow',
                'thing_amount': 7.89}]
The basic setup I have now is as follows. Note that adding the class ItemList is part of the attempt to get the arbitrary length to work.
from typing import List
from pydantic import BaseModel
from pydantic.schema import schema
import json

class Item(BaseModel):
    thing_number: int
    thing_description: str
    thing_amount: float

class ItemList(BaseModel):
    each_item: List[Item]
The basic code will then produce what I think I'm looking for: an array object that will take Item objects.
item_schema = schema([ItemList])
print(json.dumps(item_schema, indent=2))
{
  "definitions": {
    "Item": {
      "title": "Item",
      "type": "object",
      "properties": {
        "thing_number": {
          "title": "Thing_Number",
          "type": "integer"
        },
        "thing_description": {
          "title": "Thing_Description",
          "type": "string"
        },
        "thing_amount": {
          "title": "Thing_Amount",
          "type": "number"
        }
      },
      "required": [
        "thing_number",
        "thing_description",
        "thing_amount"
      ]
    },
    "ItemList": {
      "title": "ItemList",
      "type": "object",
      "properties": {
        "each_item": {
          "title": "Each_Item",
          "type": "array",
          "items": {
            "$ref": "#/definitions/Item"
          }
        }
      },
      "required": [
        "each_item"
      ]
    }
  }
}
The setup works on a single JSON item being passed:
item = Item(**data)
print(item)
Item thing_number=123 thing_description='duck' thing_amount=4.56
But when I try to pass the single item into the ItemList model, it returns an error:
item_list = ItemList(**data)
---------------------------------------------------------------------------
ValidationError Traceback (most recent call last)
<ipython-input-94-48efd56e7b6c> in <module>
----> 1 item_list = ItemList(**data)
/opt/conda/lib/python3.7/site-packages/pydantic/main.cpython-37m-x86_64-linux-gnu.so in pydantic.main.BaseModel.__init__()
/opt/conda/lib/python3.7/site-packages/pydantic/main.cpython-37m-x86_64-linux-gnu.so in pydantic.main.validate_model()
ValidationError: 1 validation error for ItemList
each_item
field required (type=value_error.missing)
I've also tried passing bigger_data into the array, thinking that it would need to start as a list. That also returns an error. Although I at least have a better understanding of the dictionary error, I can't figure out how to resolve it.
item_list2 = ItemList(**data_big)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-100-8fe9a5414bd6> in <module>
----> 1 item_list2 = ItemList(**data_big)
TypeError: MetaModel object argument after ** must be a mapping, not list
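Both errors above follow from how Python's `**` operator behaves, independently of pydantic. The sketch below uses only the standard library; the helper `received_kwargs` is hypothetical and exists purely to show which keyword arguments a callee ends up receiving. Unpacking a dict turns its keys into argument names (so nothing named each_item is ever passed), and unpacking a list fails before the callee runs at all.

```python
# Hypothetical helper (not part of pydantic): returns the keyword
# arguments it was called with, so we can inspect them.
def received_kwargs(**kwargs):
    return kwargs

data = {'thing_number': 123, 'thing_description': 'duck', 'thing_amount': 4.56}
bigger_data = [data,
               {'thing_number': 456, 'thing_description': 'cow', 'thing_amount': 7.89}]

# ** spreads the dict's keys into keyword names, so 'each_item' is never supplied.
passed = received_kwargs(**data)
each_item_supplied = 'each_item' in passed

# ** only accepts a mapping; applying it to a list raises TypeError at the call site.
try:
    received_kwargs(**bigger_data)
    list_unpack_failed = False
except TypeError:
    list_unpack_failed = True

# Passing the list by name is what gives the callee an 'each_item' argument.
named = received_kwargs(each_item=bigger_data)

print(each_item_supplied, list_unpack_failed, list(named))
```

This suggests the list has to be passed by field name (each_item=...) rather than unpacked with `**`.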
Thanks.
Other Things I've Tried
I've tried passing the data into the specific key with a little more luck (maybe?).
item_list2 = ItemList(each_item=data_big)
---------------------------------------------------------------------------
ValidationError Traceback (most recent call last)
<ipython-input-111-07e5c12bf8b4> in <module>
----> 1 item_list2 = ItemList(each_item=data_big)
/opt/conda/lib/python3.7/site-packages/pydantic/main.cpython-37m-x86_64-linux-gnu.so in pydantic.main.BaseModel.__init__()
/opt/conda/lib/python3.7/site-packages/pydantic/main.cpython-37m-x86_64-linux-gnu.so in pydantic.main.validate_model()
ValidationError: 6 validation errors for ItemList
each_item -> 0 -> thing_number
field required (type=value_error.missing)
each_item -> 0 -> thing_description
field required (type=value_error.missing)
each_item -> 0 -> thing_amount
field required (type=value_error.missing)
each_item -> 1 -> thing_number
field required (type=value_error.missing)
each_item -> 1 -> thing_description
field required (type=value_error.missing)
each_item -> 1 -> thing_amount
field required (type=value_error.missing)
Upvotes: 21
Views: 72148
Reputation: 45092
Use TypeAdapter (available in Pydantic v2):
from pydantic import TypeAdapter
To convert from JSON str to a list[Item]:
items = TypeAdapter(list[Item]).validate_json(bigger_data_json)
To convert from list[dict] to list[Item]:
items = TypeAdapter(list[Item]).validate_python(bigger_data)
To convert from a list[Item] to a JSON str (dump_json returns bytes, so decode the result):
bigger_data_json = TypeAdapter(list[Item]).dump_json(items).decode()
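For reference, here is a minimal end-to-end sketch of the conversions above, assuming Pydantic v2 and Python 3.9+ (for the `list[Item]` syntax). The Item model and bigger_data are copied from the question; the name `item_list_adapter` is only illustrative.

```python
from pydantic import BaseModel, TypeAdapter

class Item(BaseModel):
    thing_number: int
    thing_description: str
    thing_amount: float

bigger_data = [
    {'thing_number': 123, 'thing_description': 'duck', 'thing_amount': 4.56},
    {'thing_number': 456, 'thing_description': 'cow', 'thing_amount': 7.89},
]

# Build the adapter once and reuse it for every conversion.
item_list_adapter = TypeAdapter(list[Item])

# list[dict] -> list[Item]
items = item_list_adapter.validate_python(bigger_data)

# list[Item] -> JSON; dump_json returns bytes, so decode to get a str.
bigger_data_json = item_list_adapter.dump_json(items).decode()

# JSON str -> list[Item]
round_tripped = item_list_adapter.validate_json(bigger_data_json)

# list[Item] -> list[dict], e.g. as input records for pandas.
records = item_list_adapter.dump_python(round_tripped)
print(records)
```

Creating the adapter once avoids rebuilding the validator on each call, which may matter for long lists.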
Upvotes: 0
Reputation: 31
You can use Pydantic's RootModel:
https://docs.pydantic.dev/latest/concepts/models/#rootmodel-and-custom-root-types
from typing import List
from pydantic import BaseModel, RootModel

class OneItem(BaseModel):
    a: int

class ListItems(RootModel):
    root: List[OneItem]

    # Let the wrapper be iterated and indexed like the underlying list.
    def __iter__(self):
        return iter(self.root)

    def __getitem__(self, item):
        return self.root[item]

src = [{'a': 1}, {'a': 2}]
model = ListItems.model_validate(src)
for one_item in model:
    print(one_item)
dst = model.model_dump()
print(dst)
assert src == dst
Upvotes: 1
Reputation: 10948
The following also works, and does not require a root type. It uses the Pydantic v1 helpers parse_obj_as and parse_raw_as:
from pydantic import parse_obj_as, parse_raw_as
To convert from a List[dict] to a List[Item]:
items = parse_obj_as(List[Item], bigger_data)
To convert from a JSON str to a List[Item]:
items = parse_raw_as(List[Item], bigger_data_json)
To convert from a List[Item] to a JSON str:
from pydantic.json import pydantic_encoder
bigger_data_json = json.dumps(items, default=pydantic_encoder)
Or with a custom encoder:
from pydantic.json import pydantic_encoder

def custom_encoder(**kwargs):
    def base_encoder(obj):
        if isinstance(obj, BaseModel):
            return obj.dict(**kwargs)
        else:
            return pydantic_encoder(obj)
    return base_encoder

bigger_data_json = json.dumps(items, default=custom_encoder(by_alias=True))
Upvotes: 35
Reputation: 511
What did the trick for me was fastapi.encoders.jsonable_encoder (take a look at https://fastapi.tiangolo.com/tutorial/encoder/).
In your case, I appended the "single" items to a list named result, i.e. result.append(Item(thing_number=123, thing_description="duck", thing_amount=4.56)), and finally returned fastapi.responses.JSONResponse(content=fastapi.encoders.jsonable_encoder(result)).
Upvotes: 1
Reputation: 54541
To avoid having "each_item" in the ItemList, you can use the __root__ Pydantic keyword:
from typing import List
from pydantic import BaseModel

class Item(BaseModel):
    thing_number: int
    thing_description: str
    thing_amount: float

class ItemList(BaseModel):
    __root__: List[Item]  # ⯇-- __root__
To build the item_list:
just_data = [
    {"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56},
    {"thing_number": 456, "thing_description": "cow", "thing_amount": 7.89},
]
item_list = ItemList(__root__=just_data)
a_json_duck = {"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56}
item_list.__root__.append(a_json_duck)
Web frameworks that support Pydantic often serialize such an ItemList as a JSON array, without the intermediate __root__ keyword.
Upvotes: 29
Reputation: 557
from typing import List
from pydantic import BaseModel
import json

class Item(BaseModel):
    thing_number: int
    thing_description: str
    thing_amount: float

class ItemList(BaseModel):
    each_item: List[Item]

Based on your code, with each_item as a List of Item:
a_duck = Item(thing_number=123, thing_description="duck", thing_amount=4.56)
print(a_duck.json())
a_list = ItemList(each_item=[a_duck])
print(a_list.json())
This generates the following output:
{"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56}
{"each_item": [{"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56}]}
Using these as "entry JSON":
a_json_duck = {"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56}
a_json_list = {
    "each_item": [
        {"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56}
    ]
}
print(Item(**a_json_duck))
print(ItemList(**a_json_list))
This works just fine and generates:
Item thing_number=123 thing_description='duck' thing_amount=4.56
ItemList each_item=[<Item thing_number=123 thing_description='duck' thing_amount=4.56>]
That leaves the case where we have only the data:
just_datas = [
    {"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56},
    {"thing_number": 456, "thing_description": "cow", "thing_amount": 7.89},
]
item_list = ItemList(each_item=just_datas)
print(item_list)
print(type(item_list.each_item[1]))
print(item_list.each_item[1])
These work as expected:
ItemList each_item=[<Item thing_number=123 thing_description='duck'thing_amount=4.56>,<Item thin…
<class '__main__.Item'>
Item thing_number=456 thing_description='cow' thing_amount=7.89
So unless I'm missing something, the pydantic library works as expected.
My pydantic version: 0.30, Python 3.7.4
Reading from a file-like object:
json_data_file = """[
    {"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56},
    {"thing_number": 456, "thing_description": "cow", "thing_amount": 7.89}]"""
from io import StringIO
item_list2 = ItemList(each_item=json.load(StringIO(json_data_file)))
This also works fine.
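Since the question mentions converting the validated list to records for pandas, here is a minimal sketch of that last step. It assumes only that pydantic is installed; `dict(item)` relies on models being iterable as (field name, value) pairs. The `bad` list is a made-up example of invalid input, and pandas itself is not imported here.

```python
from typing import List
from pydantic import BaseModel, ValidationError

class Item(BaseModel):
    thing_number: int
    thing_description: str
    thing_amount: float

class ItemList(BaseModel):
    each_item: List[Item]

just_datas = [
    {"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56},
    {"thing_number": 456, "thing_description": "cow", "thing_amount": 7.89},
]

# Validate the whole list by passing it to the each_item field by name.
item_list = ItemList(each_item=just_datas)

# Iterating a model yields (field_name, value) pairs, so dict(item)
# produces one flat record per validated Item.
records = [dict(item) for item in item_list.each_item]

# Made-up invalid entry: thing_number cannot be parsed as an int.
bad = [{"thing_number": "not a number", "thing_description": "duck", "thing_amount": 4.56}]
try:
    ItemList(each_item=bad)
    rejected = False
except ValidationError:
    rejected = True

print(records, rejected)
```

The resulting records list could then be handed to pandas, for example via pandas.DataFrame.from_records(records).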
Upvotes: 14