Reputation: 13
I am attempting to serialize a Pydantic model schema and then deserialize it in another script. The serialization process is working as expected, and it has created two JSON files: model.json and data.json.
In test_save.py, I defined the MainModel schema and then serialized it along with an instance of MainModel. The resulting JSON files contain the schema and data, respectively.
test_save.py
from pydantic import BaseModel
import json

# Model definition
class MainModel(BaseModel):
    foo: str

# Serialize MainModel
with open('model.json', 'w') as f:
    json.dump(MainModel.schema(), f, indent=4)

# Create Instance of MainModel
maindata = MainModel(foo='bar')

# Serialize Model Instance
with open('data.json', 'w') as f:
    json.dump(maindata.dict(), f, indent=4)
model.json
{
    "title": "MainModel",
    "type": "object",
    "properties": {
        "foo": {
            "title": "Foo",
            "type": "string"
        }
    },
    "required": [
        "foo"
    ]
}
data.json
{
    "foo": "bar"
}
In test_load.py, I'm attempting to deserialize the model.json and data.json files. The create_model function from pydantic is used to define the MainModel schema based on the schema in model.json.
test_load.py
import pydantic
import json

with open('model.json', 'r') as f:
    j = json.load(f)

MainModel = pydantic.create_model('MainModel', **j)

with open('data.json', 'r') as f:
    maindata = json.load(f)

# modelinstance = MainModel.validate(maindata)
modelinstance = MainModel.parse_obj(maindata)
print(json.dumps(modelinstance.dict(), indent=4))  # This prints the schema instead of the data.
I'm attempting to deserialize a Pydantic model instance using the schema stored in model.json and data stored in data.json. However, when I run the script, instead of printing the data as expected, the script prints the schema from model.json.
The expected output from test_load.py is:
{
    "foo": "bar"
}
But the actual output is:
{
    "title": "MainModel",
    "type": "object",
    "properties": {
        "foo": {
            "title": "Foo",
            "type": "string"
        }
    },
    "required": [
        "foo"
    ]
}
I'm not sure what I'm doing wrong. Can anyone help me identify the issue?
Upvotes: 1
Views: 3064
Reputation: 16
You can even create the model 'on the fly' and refer to it in your code:
from datamodel_code_generator import generate
from pathlib import Path
# saves model to file
generate(
    input_=Path("your_model_schema.json"),
    input_file_type="jsonschema",
    output=Path("your_model.py")
)
And for using it in your code:
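import importlib
from pathlib import Path

# 'Model' is the class name assumed here; adjust it to match the class
# actually generated in your_model.py.
Model = getattr(importlib.import_module("your_model"), 'Model')
path = Path('data_file.json')
your_json_data = Model.parse_file(path)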
Tested with Python 3.10.
Upvotes: 0
Reputation: 18683
The create_model function from pydantic is used to define the MainModel schema based on the schema in model.json
That is the error.
I am not sure what made you think that this function takes a (dict-parsed) JSON schema as keyword arguments, but that is not how it works. To quote from the documentation of the create_model function:
Fields are defined by either a tuple of the form (<type>, <default value>) or just a default value.
What happened in your case is that the dictionary containing the model schema was unpacked as keyword arguments and thus it constructed a model with the fields named title, type, properties, and required.
The first two were just passed strings ("MainModel" and "object" respectively), which the model creation function interpreted as the default values, inferring the type of both those fields to be str. The properties argument ended up being a dictionary, again interpreted as the default value for a field of the type dict, and the last one got a list, which was again set as the default value for the required field of type list.
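The unpacking described above can be reproduced without Pydantic. The sketch below uses a hypothetical stand-in, infer_fields, which is not part of Pydantic and does not share its implementation; it only mimics the rule that a bare value is treated as a default whose type becomes the field type.

```python
import json

# The schema from model.json, parsed into a dict.
schema = json.loads("""
{
    "title": "MainModel",
    "type": "object",
    "properties": {"foo": {"title": "Foo", "type": "string"}},
    "required": ["foo"]
}
""")


def infer_fields(model_name, **field_definitions):
    """Hypothetical stand-in for create_model (not Pydantic's code).

    Each keyword argument becomes a field. A bare value is taken as the
    default, and its type is used as the field type.
    """
    return {
        name: (type(default).__name__, default)
        for name, default in field_definitions.items()
    }


# Unpacking the schema turns its top-level keys into field names.
fields = infer_fields("MainModel", **schema)
for name, (type_name, default) in fields.items():
    print(f"{name}: {type_name} = {default!r}")
```

Note that foo never appears as a field name: it only exists nested inside the default value of properties.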
You can verify this by printing MainModel.schema_json(indent=4) after your create_model call.
Lastly, since all four fields therefore had default values and the default config setting for extra attributes is Extra.ignore, parsing your data.json resulted in the foo key simply being ignored, whereas all four fields were assigned their default values. That is how you ended up with a model instance that looked exactly like the model schema.
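To see that last step in isolation, here is a minimal sketch with a hand-written model that should be equivalent to the one create_model built by accident. The class name AccidentalModel is made up for illustration, and the example assumes Pydantic's default handling of extra fields (ignore).

```python
from pydantic import BaseModel


# Hand-written equivalent of the accidentally created model:
# every field has a default, and none of them is called "foo".
class AccidentalModel(BaseModel):
    title: str = "MainModel"
    type: str = "object"
    properties: dict = {"foo": {"title": "Foo", "type": "string"}}
    required: list = ["foo"]


# The unknown key "foo" is dropped under the default extra-field setting,
# and every declared field falls back to its default value.
instance = AccidentalModel(**{"foo": "bar"})
result = dict(instance)
print(result)
```

The resulting dictionary has the same four keys as the schema and no foo key, which matches the output seen in the question.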
If you want to generate Pydantic models from a JSON schema, there is no built-in functionality for this. The Pydantic docs link to the datamodel-code-generator package, which can do that (and more). Since this is code generation, however, it will not work at runtime: you need to run the code generator before launching the program that attempts to use those models.
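For very simple schemas like the one in the question, a different approach is to translate the schema into create_model field tuples by hand. The following is only a rough sketch under strong assumptions: the helper model_from_flat_schema and the JSON_TO_PYTHON mapping are made up for illustration, and only flat schemas with primitive property types are handled (no nesting, no $ref, no formats or constraints).

```python
from typing import Any, Dict, Optional

from pydantic import create_model

# Assumed mapping from JSON Schema primitive types to Python types.
JSON_TO_PYTHON = {
    "string": str,
    "integer": int,
    "number": float,
    "boolean": bool,
}


def model_from_flat_schema(schema: Dict[str, Any]):
    """Hypothetical helper: build a model from a flat, primitives-only schema."""
    required = set(schema.get("required", []))
    field_definitions = {}
    for name, prop in schema["properties"].items():
        py_type = JSON_TO_PYTHON[prop["type"]]  # KeyError for unsupported types
        if name in required:
            # Ellipsis as the default marks the field as required.
            field_definitions[name] = (py_type, ...)
        else:
            field_definitions[name] = (Optional[py_type], None)
    return create_model(schema["title"], **field_definitions)


schema = {
    "title": "MainModel",
    "type": "object",
    "properties": {"foo": {"title": "Foo", "type": "string"}},
    "required": ["foo"],
}

MainModel = model_from_flat_schema(schema)
instance = MainModel(**{"foo": "bar"})
print(instance.foo)
```

Anything beyond such trivial schemas is better left to a dedicated tool like datamodel-code-generator.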
Upvotes: 1