mkierc

Reputation: 1188

jsonschema - oneOf keyword behaves unexpectedly

I'm trying to validate a JSON payload in Python, using jsonschema 3.0.1. The payload looks roughly like this (simplified to the troublesome part):

{
    "request": {
        "topic": {
            "param1": "bleep beep topic",
            "param2": "bloop boop topic"
        },
        "message": {
            "param1": "bleep beep message",
            "param2": "bloop boop message"
        }
    }
}

A valid request is expected to have two fields: a topic and matching message.

Each of them can consist of either param1 only or both param1 and param2.

But a request can't mix the two forms: it can have neither a topic with only param1 and a message with both params, nor a topic with both params and a message with only param1.

Because the content of one node depends on the content of another node, I wasn't able to use the dependencies keyword or the if-then-else construction. So I tried oneOf instead, providing a list of valid subschemas that reference the one_param and both_params versions of each field, like so:

from jsonschema import validate

one_param = {
    "type": "object",
    "properties": {
        "param1": {
            "type": "string",
        }
    },
    "required": ["param1"]
}

both_params = {
    "type": "object",
    "properties": {
        "param1": {
            "type": "string",
        },
        "param2": {
            "type": "string",
        }
    },
    "required": ["param1", "param2"]
}

test_schema = {
    "type": "object",
    "properties": {
        "request": {
            "oneOf": [
                {
                    "type": "object",
                    "properties": {
                        "topic": one_param,
                        "message": one_param
                    },

                    "required": ["topic", "message"]
                },
                {
                    "type": "object",
                    "properties": {
                        "topic": both_params,
                        "message": both_params
                    },

                    "required": ["topic", "message"]
                }
            ],
        }
    }
}

The behavior of the validator is not what I expected: it fails on the case with both params, and successfully validates the case with one param or mismatched params.

Why doesn't my validation schema behave the way I've described?


Here's the entire test that I wrote for that purpose:

from jsonschema import validate

one_param = {
    "type": "object",
    "properties": {
        "param1": {
            "type": "string",
        }
    },
    "required": ["param1"]
}

both_params = {
    "type": "object",
    "properties": {
        "param1": {
            "type": "string",
        },
        "param2": {
            "type": "string",
        }
    },
    "required": ["param1", "param2"]
}

test_schema = {
    "type": "object",
    "properties": {
        "request": {
            "oneOf": [
                {
                    "type": "object",
                    "properties": {
                        "topic": one_param,
                        "message": one_param
                    },
                    "required": ["topic", "message"]
                },
                {
                    "type": "object",
                    "properties": {
                        "topic": both_params,
                        "message": both_params
                    },
                    "required": ["topic", "message"]
                }
            ],
        }
    }
}

good_1 = {
    "request": {
        "topic": {
            "param1": "bleep beep",
            "param2": "bloop boop"
        },
        "message": {
            "param1": "bleep beep message",
            "param2": "bloop boop message"
        }
    }
}

good_2 = {
    "request": {
        "topic": {
            "param1": "bleep beep"
        },
        "message": {
            "param1": "bleep beep message"
        }
    }
}

bad_1 = {
    "request": {
        "topic": {
            "param1": "bleep beep",
        },
        "message": {
            "param1": "bleep beep message",
            "param2": "bloop boop message with no matching topic"
        }
    }
}

bad_2 = {
    "request": {
        "topic": {
            "param1": "bleep beep",
            "param2": "bloop boop topic with no matching message"
        },
        "message": {
            "param1": "bleep beep message"
        }
    }
}

validate(good_1, test_schema)  # should validate
validate(good_2, test_schema)  # should validate
validate(bad_1, test_schema)  # should fail
validate(bad_2, test_schema)  # should fail

Upvotes: 2

Views: 425

Answers (1)

Relequestual

Reputation: 12315

With oneOf, each item in the array (a subschema) is applied to the data. If you test each of the individual subschemas in your oneOf, what happens?

You'll find that BOTH are valid!
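To see this for yourself, here is a minimal sketch that runs each oneOf branch separately against the request objects from the question. It assumes Draft 7 (the newest draft jsonschema 3.0.1 supports); request_branch and branch_matches are hypothetical helper names introduced only for this illustration.

```python
from jsonschema import Draft7Validator

# Field schemas exactly as in the question (no additionalProperties).
one_param = {
    "type": "object",
    "properties": {"param1": {"type": "string"}},
    "required": ["param1"],
}
both_params = {
    "type": "object",
    "properties": {"param1": {"type": "string"}, "param2": {"type": "string"}},
    "required": ["param1", "param2"],
}


def request_branch(field_schema):
    """Hypothetical helper: build one oneOf branch for the request object."""
    return {
        "type": "object",
        "properties": {"topic": field_schema, "message": field_schema},
        "required": ["topic", "message"],
    }


branches = [request_branch(one_param), request_branch(both_params)]

both = {"param1": "bleep beep", "param2": "bloop boop"}
single = {"param1": "bleep beep"}

good_1_request = {"topic": both, "message": both}    # "request" part of good_1
bad_1_request = {"topic": single, "message": both}   # "request" part of bad_1


def branch_matches(instance):
    """Hypothetical helper: report which oneOf branches accept the instance."""
    return [Draft7Validator(branch).is_valid(instance) for branch in branches]


good_1_matches = branch_matches(good_1_request)
bad_1_matches = branch_matches(bad_1_request)
print(good_1_matches)  # [True, True]  -> two matches, so oneOf rejects it
print(bad_1_matches)   # [True, False] -> exactly one match, so oneOf accepts it
```

oneOf passes only when exactly one branch matches, which is why good_1 is rejected and the mismatched cases are accepted.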

Your "one_param" schema needs to make sure that including param2 would cause it to fail. You can use additionalProperties to do this...

{
  "type": "object",
  "properties": {
    "param1": {
      "type": "string"
    }
  },
  "required": [
    "param1"
  ],
  "additionalProperties": false
}
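As a rough check, the test from the question can be rerun with this stricter one_param. The sketch below assumes Draft 7 and uses is_valid instead of validate so that all four cases are reported without raising; request_branch is a hypothetical helper, not part of jsonschema.

```python
from jsonschema import Draft7Validator

# one_param now rejects any key other than param1.
one_param = {
    "type": "object",
    "properties": {"param1": {"type": "string"}},
    "required": ["param1"],
    "additionalProperties": False,
}
both_params = {
    "type": "object",
    "properties": {"param1": {"type": "string"}, "param2": {"type": "string"}},
    "required": ["param1", "param2"],
}


def request_branch(field_schema):
    """Hypothetical helper: build one oneOf branch for the request object."""
    return {
        "type": "object",
        "properties": {"topic": field_schema, "message": field_schema},
        "required": ["topic", "message"],
    }


test_schema = {
    "type": "object",
    "properties": {
        "request": {
            "oneOf": [request_branch(one_param), request_branch(both_params)]
        }
    },
}

both = {"param1": "bleep beep", "param2": "bloop boop"}
single = {"param1": "bleep beep"}

cases = {
    "good_1": {"request": {"topic": both, "message": both}},
    "good_2": {"request": {"topic": single, "message": single}},
    "bad_1": {"request": {"topic": single, "message": both}},
    "bad_2": {"request": {"topic": both, "message": single}},
}

validator = Draft7Validator(test_schema)
results = {name: validator.is_valid(doc) for name, doc in cases.items()}
print(results)
# {'good_1': True, 'good_2': True, 'bad_1': False, 'bad_2': False}
```

With the branches now mutually exclusive, each valid request matches exactly one of them and each mismatched request matches none.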

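I think you were assuming that only the properties listed under properties are allowed, but that's not the case. The properties keyword only constrains those keys when they are present; any other key is accepted unless additionalProperties forbids it. For the same reason, a key listed under properties is optional until you also name it in required.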

You can see this working by trying the schema on https://jsonschema.dev. I've pre-loaded the link with the updated schema and instance.

As an aside, you can use definitions and $ref to avoid repeating subschemas, should you wish to save your schema to a single JSON file.
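A minimal sketch of that layout, again assuming Draft 7 and including the additionalProperties fix: the two field schemas live once under definitions and the oneOf branches point at them with $ref. The definition names one_param and both_params are taken from the question; the dict could be dumped to a single JSON file as-is.

```python
from jsonschema import Draft7Validator

schema = {
    "definitions": {
        "one_param": {
            "type": "object",
            "properties": {"param1": {"type": "string"}},
            "required": ["param1"],
            "additionalProperties": False,
        },
        "both_params": {
            "type": "object",
            "properties": {
                "param1": {"type": "string"},
                "param2": {"type": "string"},
            },
            "required": ["param1", "param2"],
        },
    },
    "type": "object",
    "properties": {
        "request": {
            "oneOf": [
                {
                    "type": "object",
                    "properties": {
                        "topic": {"$ref": "#/definitions/one_param"},
                        "message": {"$ref": "#/definitions/one_param"},
                    },
                    "required": ["topic", "message"],
                },
                {
                    "type": "object",
                    "properties": {
                        "topic": {"$ref": "#/definitions/both_params"},
                        "message": {"$ref": "#/definitions/both_params"},
                    },
                    "required": ["topic", "message"],
                },
            ]
        }
    },
}

validator = Draft7Validator(schema)

both = {"param1": "bleep beep", "param2": "bloop boop"}
single = {"param1": "bleep beep"}

matched_ok = validator.is_valid({"request": {"topic": both, "message": both}})
mismatched_ok = validator.is_valid({"request": {"topic": single, "message": both}})
print(matched_ok, mismatched_ok)  # True False
```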

Upvotes: 1
