Sanket Wagh

Reputation: 166

Validation shows invalid details in the 422 Unprocessable Entity response (FastAPI, Pydantic)

class Foo(BaseModel):
    template: str
    body: FooBody

class Bar(BaseModel):
    template: str
    body: BarBody

class Xyz(BaseModel):
    template: str
    body: XyzBody

@router.post("/something", status_code=status.HTTP_200_OK)
async def get_pdf(
    request: Request,
    request_body: Union[Foo, Bar, Xyz],
):
    ...

In the above code snippet, my body can be any one of three types, using Union.

The code works perfectly for the given body types. However, if a single field is missing, the 422 validation error lists many missing fields even though only one is actually missing.

What could be the cause of this? Or am I using Union incorrectly?

My goal is to allow only the mentioned BaseModel classes (Foo, Bar, Xyz). If my request is detected as Foo and a certain field is missing from it, the response should show only that field, instead of showing all the fields of Bar and Xyz along with the one missing in Foo.

Minimal Reproducible Example

from typing import Union

from fastapi import FastAPI, status
from pydantic import BaseModel

app = FastAPI(debug=True)


class FooBody(BaseModel):
    foo1: str
    foo2: int
    foo3: str

class Foo(BaseModel):
    temp: str
    body: FooBody

class BarBody(BaseModel):
    bar1: str
    bar2: int
    bar3: str

class Bar(BaseModel):
    temp: str
    body: BarBody

class XyzBody(BaseModel):
    xyz1: str
    xyz2: int
    xyz3: str

class Xyz(BaseModel):
    temp: str
    body: XyzBody

@app.get("/type", status_code=status.HTTP_200_OK)
def health(response_body: Union[Foo, Bar, Xyz]):
    return response_body

So if I send

{
    "temp": "xyz",
    "body": {
        "foo1": "ok",
        "foo2": 1,
        "foo3": "2"
    }
}

It works as expected. But if I omit one parameter, say foo3, from the request body, I don't get a validation error saying only that foo3 is missing. Instead I get:

{
    "detail": [
        {
            "loc": [
                "body",
                "body",
                "foo3"
            ],
            "msg": "field required",
            "type": "value_error.missing"
        },
        {
            "loc": [
                "body",
                "body",
                "bar1"
            ],
            "msg": "field required",
            "type": "value_error.missing"
        },
        {
            "loc": [
                "body",
                "body",
                "bar2"
            ],
            "msg": "field required",
            "type": "value_error.missing"
        },
        {
            "loc": [
                "body",
                "body",
                "bar3"
            ],
            "msg": "field required",
            "type": "value_error.missing"
        },
        {
            "loc": [
                "body",
                "body",
                "xyz1"
            ],
            "msg": "field required",
            "type": "value_error.missing"
        },
        {
            "loc": [
                "body",
                "body",
                "xyz2"
            ],
            "msg": "field required",
            "type": "value_error.missing"
        },
        {
            "loc": [
                "body",
                "body",
                "xyz3"
            ],
            "msg": "field required",
            "type": "value_error.missing"
        }
    ]
}

That is, all the fields of every class mentioned in the Union.

Am I using Union wrong?

What I need is for the endpoint to accept a body matching only the classes I add. If it detects that the body is a Foo, it should only run the validations of class Foo and not those of every class in the Union.

Upvotes: 1

Views: 921

Answers (1)

Daniil Fajnberg

Reputation: 18458

I will try to rephrase and condense your question because it contains a lot of code that is entirely unrelated to the actual underlying problem of validation that you came across.


MRE

Here is what the problem actually boils down to:

from pydantic import BaseModel, ValidationError


class Foo(BaseModel):
    foo1: str
    foo2: int


class Bar(BaseModel):
    bar1: bool
    bar2: bytes


class Model(BaseModel):
    data: Foo | Bar


def test(model: type[BaseModel], data: dict[str, object]) -> None:
    try:
        instance = model.parse_obj({"data": data})
    except ValidationError as error:
        print(error.json(indent=4))
    else:
        print(instance.json(indent=4))


if __name__ == "__main__":
    incomplete_test_data = {"foo1": "a"}
    valid_test_data = incomplete_test_data | {"foo2": 1}
    test(Model, valid_test_data)
    test(Model, incomplete_test_data)

The output of the first test call is as expected:

{
    "data": {
        "foo1": "a",
        "foo2": 1
    }
}

But the second one gives us the following:

[
    {
        "loc": [
            "data",
            "foo2"
        ],
        "msg": "field required",
        "type": "value_error.missing"
    },
    {
        "loc": [
            "data",
            "bar1"
        ],
        "msg": "field required",
        "type": "value_error.missing"
    },
    {
        "loc": [
            "data",
            "bar2"
        ],
        "msg": "field required",
        "type": "value_error.missing"
    }
]

This is not what we want. We want the validation error caused by the second call to recognize that validation should be done via the Foo model and only foo2 is missing, so that it contains only one actual error:

[
    {
        "loc": [
            "data",
            "foo2"
        ],
        "msg": "field required",
        "type": "value_error.missing"
    }
]

How can this be accomplished?


Answer

This is exactly what discriminated unions are for. They are also part of the OpenAPI specification. However, as the documentation shows, a discriminated union requires a discriminator field to be added to each of the models in that union.

Here is how this could look:

from typing import Literal
from pydantic import BaseModel, Field, ValidationError


...


class FooDisc(BaseModel):
    data_type: Literal["foo"]
    foo1: str
    foo2: int


class BarDisc(BaseModel):
    data_type: Literal["bar"]
    bar1: bool
    bar2: bytes


class ModelDisc(BaseModel):
    data: FooDisc | BarDisc = Field(..., discriminator="data_type")


if __name__ == "__main__":
    ...
    incomplete_test_data = {
        "data_type": "foo",
        "foo1": "a",
    }
    valid_test_data = incomplete_test_data | {"foo2": 1}
    test(ModelDisc, valid_test_data)
    test(ModelDisc, incomplete_test_data)

Now the output of the first test call is this:

{
    "data": {
        "data_type": "foo",
        "foo1": "a",
        "foo2": 1
    }
}

And the second call gives just the following:

[
    {
        "loc": [
            "data",
            "FooDisc",
            "foo2"
        ],
        "msg": "field required",
        "type": "value_error.missing"
    }
]

As the linked Pydantic docs show, more models than two and more complex/nested constructs using discriminated unions are possible as well.
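
To illustrate the three-model, nested case from the question, here is a minimal sketch. The kind field, its literal values, the Payload wrapper and the missing_fields helper are all assumptions introduced for illustration; the original models have no such field, so the client would have to start sending it.

```python
from typing import Literal, Union

from pydantic import BaseModel, Field, ValidationError


class FooBody(BaseModel):
    foo1: str
    foo2: int
    foo3: str


class BarBody(BaseModel):
    bar1: str
    bar2: int
    bar3: str


class XyzBody(BaseModel):
    xyz1: str
    xyz2: int
    xyz3: str


class Foo(BaseModel):
    kind: Literal["foo"]  # hypothetical discriminator field
    temp: str
    body: FooBody


class Bar(BaseModel):
    kind: Literal["bar"]  # hypothetical discriminator field
    temp: str
    body: BarBody


class Xyz(BaseModel):
    kind: Literal["xyz"]  # hypothetical discriminator field
    temp: str
    body: XyzBody


class Payload(BaseModel):
    # Hypothetical wrapper; "kind" decides which model validates the request.
    request: Union[Foo, Bar, Xyz] = Field(..., discriminator="kind")


def missing_fields(raw: dict) -> list:
    """Return the last location element of every validation error."""
    try:
        Payload(request=raw)
    except ValidationError as error:
        return [err["loc"][-1] for err in error.errors()]
    return []


complete = {
    "kind": "foo",
    "temp": "xyz",
    "body": {"foo1": "ok", "foo2": 1, "foo3": "2"},
}
# Same payload as above, but with foo3 left out of the nested body.
incomplete = {"kind": "foo", "temp": "xyz", "body": {"foo1": "ok", "foo2": 1}}

print(missing_fields(complete))
print(missing_fields(incomplete))
```

With the discriminator set to "foo", only the Foo model is consulted, so the second call should report foo3 alone and nothing from BarBody or XyzBody.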

While the added field may seem annoying, you need to realize that this is the only generally reliable way to convey which model/schema to use. If you want to get fancy with your specific situation and not use discriminators, you can always write your own validator with pre=True on the model containing the (regular) union and try to parse the data for that field inside that validator based on (for example) keys that you find in the dictionary passed there. But I would advise against this because it introduces a lot of room for errors. Discriminated unions have been introduced for a reason and this problem is exactly that reason.
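
For completeness, the validator-based alternative could look roughly like the sketch below. It assumes the Pydantic 1.x style validator decorator used elsewhere in this answer. The ModelKeyed name, the error_text helper and the key-prefix heuristic are hypothetical and only serve to show the idea.

```python
from typing import Union

from pydantic import BaseModel, ValidationError, validator


class Foo(BaseModel):
    foo1: str
    foo2: int


class Bar(BaseModel):
    bar1: bool
    bar2: bytes


class ModelKeyed(BaseModel):
    data: Union[Foo, Bar]  # regular union, no discriminator field

    @validator("data", pre=True)
    def pick_model_by_keys(cls, value):
        # Hypothetical heuristic: guess the model from the key names.
        if not isinstance(value, dict):
            return value
        if any(str(key).startswith("foo") for key in value):
            return Foo(**value)  # only Foo's own errors can be raised here
        if any(str(key).startswith("bar") for key in value):
            return Bar(**value)
        raise ValueError("cannot tell whether the data is a Foo or a Bar")


def error_text(raw: dict) -> str:
    """Return the validation error message, or an empty string on success."""
    try:
        ModelKeyed(data=raw)
    except ValidationError as error:
        return str(error)
    return ""


print(error_text({"foo1": "a"}))
```

The message should mention foo2 but none of the Bar fields. The weakness is the guessing step: a payload such as {"foo1": "a", "bar1": true} is ambiguous, and every new model in the union needs its own hand-written rule, which is the kind of room for errors mentioned above.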

Upvotes: 1
