hidden

Reputation: 192

Set defaults from JSON schema during validation

I am trying to write a validator that sets defaults from a JSON schema during validation.

I found the question "Trying to make JSON Schema validator in Python to set default values" and adapted its code a bit. Since I use jsonschema==3.2.0, I came up with the following code:

import jsonschema


def _with_default_setter_extension(validator_class):
    """Extend validator class with defaults setter.

    With this extension, the validator class will set all defaults from a
    schema being validated to a validated instance.
    """

    def _set_defaults(validator, properties, instance, schema):
        if not validator.is_type(instance, "object"):
            return

        valid = True
        for prop, subschema in properties.items():
            if prop in instance:
                for error in validator.descend(
                    instance[prop],
                    subschema,
                    path=prop,
                    schema_path=prop,
                ):
                    valid = False
                    yield error

        # set defaults only when validation is successful
        if valid:
            # set root default when instance is empty
            if not instance and "default" in schema:
                instance.update(schema["default"])
                return

            for prop, subschema in properties.items():
                if "default" in subschema and not isinstance(instance, list):
                    instance.setdefault(prop, subschema["default"])

    return jsonschema.validators.extend(
        validator_class, {"properties": _set_defaults}
    )

It works well except for one case that is important to me. I wrote the following test to show the failure:

def test_defaults_from_oneOf_only_defaults_from_valid_schema_are_set():
    """When oneOf is used, I expect only defaults from the valid subschema to be set."""
    schema = {
        "oneOf": [
            {
                "properties": {
                    "p": {"enum": ["one"]},
                    "params": {"properties": {"q": {"default": 1}}},
                }
            },
            {
                "properties": {
                    "p": {"enum": ["two"]},
                    "params": {"properties": {"w": {"default": 2}}},
                }
            },
        ],
    }
    assert _VALIDATOR.validate({"p": "two", "params": {}}, schema) == {
        "p": "two",
        "params": {"w": 2},
    }

The test fails with this assertion error:

AssertionError: assert {'p': 'two', 'params': {'q': 1, 'w': 2}} == {'p': 'two', 'params': {'w': 2}}
  +{'p': 'two', 'params': {'q': 1, 'w': 2}}
  -{'p': 'two', 'params': {'w': 2}}
  Full diff:
  - {'p': 'two', 'params': {'w': 2}}
  + {'p': 'two', 'params': {'q': 1, 'w': 2}}
  ?

As we can see, even though the first subschema is invalid, the default value ("q") from its "params" is still set. With some debugging, I discovered that overriding only the "properties" validator leaves it without context: when "params" of the first subschema is validated, nothing tells me that validation of "p" has already failed within the same subschema.

Any insight into what I could try would be appreciated.

Upvotes: 2

Views: 785

Answers (2)

Lars Maxfield

Reputation: 53

Instead of setting defaults during validation, have you considered first filling all defaults in an instance and then validating it?

You can use fill_default from jsonschema-fill-default to fill all missing defaults in an existing instance with its schema:

pip install jsonschema-fill-default
from jsonschema_fill_default import fill_default

schema = {
    "type": "object",
    "properties": {
        "key_1": {},
        "key_2": {
            "type": "string",
            "default": "do_not_overwrite_if_key_exists",
        },
        "key_3": {
            "type": "string",
            "default": "use_it_if_key_does_not_exist",
        },
    },
    "required": ["key_1"],
}

json_dict = {"key_1": "key_1_value", "key_2": "key_2_value"}

fill_default(json_dict, schema)  # Mutates the instance in place!

# json_dict is now:
# {"key_1": "key_1_value", "key_2": "key_2_value", "key_3": "use_it_if_key_does_not_exist"}

The function you show only fills defaults in "properties", whereas fill_default works with all nested combinations of "properties", "allOf", "anyOf", "oneOf", "dependentSchemas", "if-then(-else)", "prefixItems", and "items" keywords of Draft 2020-12 JSON Schema.

Upvotes: 0

Guangyang Li

Reputation: 2821

The explanation is in the original jsonschema FAQ entry that this approach comes from:

In this code, we add the default properties to each object before the properties are validated, so the default values themselves will need to be valid under the schema.

You are running both the validation and the default-setting with the same extended validator, which is why the dict is still updated even when a subschema fails. You might want to validate first with a separate, unmodified validator, and only then set the defaults.
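One way to apply the "validate separately first" idea to "oneOf" is to try each branch on a throwaway copy of the instance and keep only the copy from the single branch that validates. The sketch below is an assumption-laden illustration, not the library's own mechanism: with_branch_aware_defaults, _overwrite and BranchAwareValidator are hypothetical names, Draft7Validator is picked arbitrarily, and the "oneOf" error message is simplified compared to the built-in one.

```python
import copy

from jsonschema import Draft7Validator, ValidationError, validators


def _overwrite(target, source):
    """Copy the contents of `source` into `target` in place."""
    if isinstance(target, dict):
        target.clear()
        target.update(source)
    elif isinstance(target, list):
        target[:] = source


def with_branch_aware_defaults(validator_class):
    """Hypothetical helper: set defaults only from the matching oneOf branch."""
    validate_properties = validator_class.VALIDATORS["properties"]

    def set_defaults(validator, properties, instance, schema):
        # Fill in defaults first (as in the jsonschema FAQ), then validate.
        if validator.is_type(instance, "object"):
            for prop, subschema in properties.items():
                if isinstance(subschema, dict) and "default" in subschema:
                    instance.setdefault(prop, copy.deepcopy(subschema["default"]))
        yield from validate_properties(validator, properties, instance, schema)

    def one_of(validator, subschemas, instance, schema):
        # Validate every branch against its own scratch copy, so defaults
        # written by a failing branch never reach the real instance.
        matches = []
        for index, subschema in enumerate(subschemas):
            scratch = copy.deepcopy(instance)
            errors = list(validator.descend(scratch, subschema, schema_path=index))
            if not errors:
                matches.append(scratch)
        if len(matches) == 1:
            _overwrite(instance, matches[0])
        else:
            yield ValidationError(
                f"{instance!r} is valid under {len(matches)} of the oneOf "
                "subschemas, expected exactly 1"
            )

    return validators.extend(
        validator_class, {"properties": set_defaults, "oneOf": one_of}
    )


BranchAwareValidator = with_branch_aware_defaults(Draft7Validator)

schema = {
    "oneOf": [
        {
            "properties": {
                "p": {"enum": ["one"]},
                "params": {"properties": {"q": {"default": 1}}},
            }
        },
        {
            "properties": {
                "p": {"enum": ["two"]},
                "params": {"properties": {"w": {"default": 2}}},
            }
        },
    ],
}

instance = {"p": "two", "params": {}}
BranchAwareValidator(schema).validate(instance)
print(instance)  # {'p': 'two', 'params': {'w': 2}}
```

The trade-off is one deep copy of the instance per "oneOf" branch, which may matter for large documents. "anyOf" and "if"/"then"/"else" would need similar treatment if defaults from non-matching branches must be avoided there too.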

Upvotes: 0
