Reputation: 1454
What is the right way to model a recursive data structure's schema in Cerberus?
from cerberus import Validator, schema_registry
schema_registry.add("leaf", {"value": {"type": "integer", "required": True}})
schema_registry.add("tree", {"type": "dict", "anyof_schema": ["leaf", "tree"]})
v = Validator(schema = {"root": {"type": "dict", "schema": "tree"}})
Error:
cerberus.schema.SchemaError: {'root': [{
    'schema': [
        'no definitions validate', {
            'anyof definition 0': [{
                'anyof_schema': ['must be of dict type'],
                'type': ['null value not allowed'],
            }],
            'anyof definition 1': [
                'Rules set definition tree not found.'
            ],
        },
    ]},
]}
The above error indicates the need for a rules set definition for tree:
from cerberus import Validator, schema_registry, rules_set_registry
schema_registry.add("leaf", {"value": {"type": "integer", "required": True}})
rules_set_registry.add("tree", {"type": "dict", "anyof_schema": ["leaf", "tree"]})
v = Validator(schema = {"root": {"type": "dict", "schema": "tree"}})
v.validate({"root": {"value": 1}})
v.errors
v.validate({"root": {"a": {"value": 1}}})
v.errors
v.validate({"root": {"a": {"b": {"c": {"value": 1}}}}})
v.errors
Output:
False
{'root': ['must be of dict type']}
for all 3 examples.
Ideally, I would like all the below documents to pass validation:
v = Validator(schema = {"root": {"type": "dict", "schema": "tree"}})
assert v.validate({"root": {"value": 1}}), v.errors
assert v.validate({"root": {"a": {"value": 1}}}), v.errors
assert v.validate({"root": {"a": {"b": {"c": {"value": 1}}}}}), v.errors
Upvotes: 1
Views: 84
Reputation: 1454
The below is not a complete solution. If someone has a full working solution with cerberus, please share it, and I will happily mark your answer as the solution.
The tree's leaves contain some keys that must match another part of the document I am validating. For this reason, I have an additional is_in validation method in my custom Validator. However, I couldn't find a good way to have a child validator for the leaves while still keeping a reference to another part of the document at the root.
I have now spent more time "fighting" cerberus than it would have taken me to implement a custom input validation function, so I may try that instead for now, or try jsonschema. (EDIT: see attempt #4 below.)
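For reference, here is a minimal sketch of what such a custom input validation function could look like in plain Python, without cerberus. It only covers the reduced problem from the question (a leaf is a dict with an integer "value"). The function names is_leaf, is_tree and validate_document are made up for illustration, and an empty dict is accepted as a tree simply because nothing in it violates the rule.

```python
from typing import Any


def is_leaf(node: Any) -> bool:
    """A leaf is a dict holding an integer under "value" (bool is excluded)."""
    if not isinstance(node, dict):
        return False
    value = node.get("value")
    return isinstance(value, int) and not isinstance(value, bool)


def is_tree(node: Any) -> bool:
    """A tree is a dict whose values are all leaves or trees (recursive)."""
    if not isinstance(node, dict):
        return False
    return all(is_leaf(child) or is_tree(child) for child in node.values())


def validate_document(document: Any) -> bool:
    """The document must hold a leaf or a tree under the "root" key."""
    if not isinstance(document, dict) or "root" not in document:
        return False
    root = document["root"]
    return is_leaf(root) or is_tree(root)
```

With this sketch, the three documents from the question are accepted, while something like {"root": {"a": 1}} is rejected because 1 is neither a leaf nor a tree.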
cerberus custom validator
Hopefully, the below logic can still be useful to someone.
from cerberus import Validator
from typing import Any

# Note: KEYS (the keys identifying a leaf) and SCHEMA (the leaf schema)
# are not defined in this snippet; they must exist at module level.

class ManifestValidator(Validator):
    def _validate_type_tree(self: Validator, value: Any) -> bool:
        # A tree must be a dict whose values are all leaves or sub-trees.
        if not isinstance(value, dict):
            return False
        for v in value.values():
            if isinstance(v, dict):
                if all(key in v for key in KEYS):
                    # Looks like a leaf: validate it with a child validator.
                    schema = self._resolve_schema(SCHEMA)
                    validator = self._get_child_validator(
                        document_crumb=v,
                        schema_crumb=(v, "schema"),
                        root_document=self.root_document,
                        root_schema=self.root_schema,
                        schema=schema,
                    )
                    if not validator(v, update=self.update) or validator._errors:
                        self._error(validator._errors)
                        return False
                elif not self._validate_type_tree(v):
                    # Not a leaf, so it must itself be a valid sub-tree.
                    return False
            else:
                return False
        return True

    def _validate_is_in(self: Validator, path: str, field: str, value: str) -> bool:
        """{'type': 'string'}"""
        # Follow the dotted path from the root document.
        document = self.root_document
        for element in path.split("."):
            if element not in document:
                self._error(field, f"{path} does not exist in {document}")
                return False
            document = document[element]
        if not isinstance(document, list):
            self._error(
                field,
                f"{path} does not point to a list but to {document} of type {type(document)}",
            )
            return False
        if value not in document:
            self._error(field, f"{value} is not present in {document} at {path}.")
            return False
        return True
jsonschema + custom validation logic
from jsonschema import validate
SCHEMA = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "object",
    "properties": {
        "root": {
            "oneOf": [
                {"$ref": "#/$defs/tree"},
                {"$ref": "#/$defs/leaf"},
            ],
        },
    },
    "required": [
        "root",
    ],
    "$defs": {
        "tree": {
            "type": "object",
            "patternProperties": {
                "^[a-z]+([_-][a-z]+)*$": {
                    "oneOf": [
                        {"$ref": "#/$defs/tree"},
                        {"$ref": "#/$defs/leaf"},
                    ],
                },
            },
            "additionalProperties": False,
        },
        "leaf": {
            "type": "object",
            "properties": {
                # In reality, the leaf is a more complex object, but as a reduction of my problem:
                "value": {
                    "type": "number",
                },
            },
            "required": [
                "value",
            ],
        },
    },
}
TREES = [
    {"root": {"value": 1}},
    {"root": {"a": {"value": 1}}},
    {"root": {"a": {"b": {"c": {"value": 1}}}}},
    {"root": {"a-subtree": {"b-subtree": {"c-subtree": {"value": 1}}}}},
]
for tree in TREES:
    validate(tree, SCHEMA)
For my additional constraint (is_in), JSON pointers / JSON relative pointers / $data seem like they could be useful in simpler cases, but for what I needed, I decided to implement custom validation logic after the jsonschema validation, which was a good first step to prove that the document is well-formed.
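As a rough illustration of that second step, the sketch below checks an is_in-style constraint in plain Python once the document is known to be well-formed. It is not my actual code: the top-level "allowed" key and the helper names iter_leaves, resolve and check_is_in are hypothetical, and it compares each leaf's "value" only because that is the one leaf key in the reduced example.

```python
from typing import Any, Iterator, List


def iter_leaves(node: dict) -> Iterator[dict]:
    """Yield every leaf (a dict with a "value" key) at or below node.

    Assumes the document already passed schema validation, so every
    non-leaf node is a dict of sub-trees or leaves.
    """
    if "value" in node:
        yield node
        return
    for child in node.values():
        yield from iter_leaves(child)


def resolve(document: dict, path: str) -> Any:
    """Follow a dotted path such as "meta.allowed" from the document root."""
    current: Any = document
    for element in path.split("."):
        if not isinstance(current, dict) or element not in current:
            raise KeyError(f"{path} does not exist in the document")
        current = current[element]
    return current


def check_is_in(document: dict, path: str) -> List[str]:
    """Return one message per leaf whose value is missing from the list at path."""
    allowed = resolve(document, path)
    if not isinstance(allowed, list):
        return [f"{path} does not point to a list"]
    return [
        f"{leaf['value']} is not present at {path}"
        for leaf in iter_leaves(document["root"])
        if leaf["value"] not in allowed
    ]
```

An empty list from check_is_in means every leaf satisfied the constraint. Returning messages instead of raising keeps all violations visible at once, similar to how v.errors collects them in cerberus.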
Upvotes: 0