Reputation: 70

Use @validator in pydantic model for date comparison

I'm using Pydantic and I'm trying to check that date_from is earlier than date_to (both of these fields are optional) in my Pydantic BaseModel, and to return a 422 error if that is not the case.

I tried the method presented in the StackOverflow question "pydantic Multi-field comparison", but without success: I always got a 200 response.

Here's my model:

class MyModel(BaseModel):
    startFrom: Optional[date] = Field(
        ...,
        description='XXX',
        example='2023-03-15',
    )

    StartTo: Optional[date] = Field(
        ...,
        description='XXX',
        example='2023-03-31',
    )

    otherData: str = Field(
        ...,
        description='XXX',
        example='XX',
    )

    @validator('startFrom')
    def date_order(cls, startFrom, values, **kwargs):
        if ('StartTo' not in values):
            return startFrom

        if (startFrom > values['StartTo']):
            raise ValueError('startFrom must be inferior to StartTo.')
        return startFrom

I don't really know what I'm doing wrong.

Thanks to everyone for your help :)

Upvotes: 3

Answers (2)

NeilG

Reputation: 4160

Use Pydantic to validate incoming start and end dates

I have a lot of Pydantic models that include *_start_date and *_end_date fields, e.g. plan_start_date and plan_end_date. It is convenient and reliable to have Pydantic validate or normalise these as soon as the request is received.

What kind of validation to perform

The following code ensures that the "start" or "from" dates are always less than or equal to the "end" or "to" dates. Having certainty about that in your service greatly simplifies date arithmetic and avoids repeated validation.

If the client is sending the dates the wrong way round, I think it's worth returning HTTP 400 for that. It's a mistake and there could be other mistakes in the request. But if you want to tolerate that as well, and swap them around, the following method still provides one DRY location to do that.

Derive a new base class from BaseModel

The following approach sub-classes pydantic.BaseModel. This new base class automatically detects "from" and "to" date fields (by naming convention) and applies the validation, returning HTTP 400 on failure.

To use it, derive the models that need it from the new base class instead of pydantic.BaseModel; the validation is then applied automatically.

The new base class

I'm not going to follow the inconsistent and unconventional field naming in the question, because I don't want to perpetuate bad practice, but I'll try to use similar PEP-8 equivalent names. The date field detection relies on using particular field name suffixes.

I have a range of custom exceptions for identifying HTTP Server Error (for instance) and others. To simplify I haven't included them here. It may help to refer to Pydantic's conventions about handling errors.

import datetime

import pydantic


class DurationModel(pydantic.BaseModel):
    """Apply *_from_date and *_to_date field order validation."""

    @pydantic.model_validator(mode="after")
    def validate_date_order(self) -> "DurationModel":
        """Validate *_from_date does not come after *_to_date."""

        from_date_field = [field for field in self.__dict__ if field.endswith("from_date")]
        to_date_field = [field for field in self.__dict__ if field.endswith("to_date")]

        if not (from_date_field and to_date_field):  # Raise HTTP 500
            msg = self.__class__.__name__ + " does not contain 'from' and 'to' date fields"  # pragma: nocover
            raise AttributeError(msg)  # pragma: nocover: not expecting unit tests for these lines

        if (getattr(self, from_date_field[0]) or datetime.date.min) > (getattr(self, to_date_field[0]) or datetime.date.max):
            msg = self.__class__.__name__ + " 'from' date after 'to' date"
            raise ValueError(msg)  # This should be HTTP 400

        return self

DurationModel uses a Pydantic "after" mode model validator. This is a validator that runs after the standard Pydantic validators, so the date fields are already datetime.date instances. It's also a whole model validator, so it has access to all the fields in the model, not just one of them.

How to use

I will follow "late model" Python and Pydantic 2 approaches. Thus I will not be importing typing.Optional.

The DurationModel supports "optional" datetime.date fields for which I use datetime.date | None. I've provided a default of None and the validator should handle None dates. Of course it can be more strict if needed.

Again, I can't bring myself to perpetuate the "start from" and "start to" naming as this is inherently contradictory, but I've done my best to fit in. I've skipped the Field descriptions to keep it brief. Those are easily added back in if needed.

import datetime

class MyModel(DurationModel):
    from_date: datetime.date | None = None
    to_date: datetime.date | None = None
    other: str

my_data = {"from_date": "2023-01-01", "to_date": "2023-12-31", "other": "stuff"}
my_model = MyModel(**my_data)

Some examples

Here are some quick usage examples of how it works.

my_data = {"from_date": "2023-01-01", "to_date": "2023-12-31", "other": "stuff"}
my_model = MyModel(**my_data)
# from_date=datetime.date(2023, 1, 1)
# to_date=datetime.date(2023, 12, 31)
# other='stuff'

my_data = {"from_date": "2024-01-01", "to_date": "2023-12-31", "other": "stuff"}
my_model = MyModel(**my_data)
# Value error, MyModel 'from' date after 'to' date
# [type=value_error, input_value={'from_date': '2024-01-01...}, input_type=dict]

my_data = {"from_date": "2024-01-01", "to_date": "", "other": "stuff"}
my_model = MyModel(**my_data)
# Input should be a valid date or datetime, input is too short
# [type=date_from_datetime_parsing, input_value='', input_type=str]

my_data = {"from_date": "2024-01-01", "to_date": None, "other": "stuff"}
my_model = MyModel(**my_data)
# from_date=datetime.date(2024, 1, 1)
# to_date=None
# other='stuff'

Simple application to question

And at the risk of making this answer too long, I've applied the principles above to implement an answer that fits just the simple use case where a generic base class is overkill.

import pydantic
import datetime


class MyModel(pydantic.BaseModel):
    from_date: datetime.date | None = None
    to_date: datetime.date | None = None
    other_data: str

    @pydantic.model_validator(mode="after")
    def validate_date_order(self) -> "MyModel":
        if (self.from_date or datetime.date.min) > (self.to_date or datetime.date.max):
            msg = "MyModel 'from_date' comes after 'to_date'"
            raise ValueError(msg)
        return self

my_data = {"from_date": "2023-01-01", "to_date": "2023-12-31", "other_data": "stuff"}
my_model = MyModel(**my_data)
print(my_model)

my_data = {"from_date": "2024-01-01", "to_date": None, "other_data": "stuff"}
my_model = MyModel(**my_data)
print(my_model)

my_data = {"from_date": "2024-01-01", "to_date": "2023-12-31", "other_data": "stuff"}
my_model = MyModel(**my_data)
# Value error, MyModel 'from_date' comes after 'to_date'

Upvotes: 0

Samuel Colvin

Reputation: 13339

You need to change the order of the fields so that startFrom comes after StartTo. A field validator only sees the fields declared before the one it validates, so review the docs on the order of fields.
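
A minimal sketch of the same idea, assuming Pydantic 2, where field_validator with ValidationInfo replaces the question's @validator and values. Rather than reordering the fields, it attaches the validator to the field declared last, which has the same effect. The snake_case field names and the None defaults are assumptions, not the question's exact model.

```python
from datetime import date
from typing import Optional

from pydantic import BaseModel, ValidationError, ValidationInfo, field_validator


class MyModel(BaseModel):
    start_from: Optional[date] = None
    start_to: Optional[date] = None
    other_data: str

    # start_to is declared after start_from, so start_from has already been
    # validated and is available in info.data when this validator runs.
    @field_validator("start_to")
    @classmethod
    def date_order(cls, start_to: Optional[date], info: ValidationInfo) -> Optional[date]:
        start_from = info.data.get("start_from")
        if start_from is not None and start_to is not None and start_from > start_to:
            raise ValueError("start_from must not be after start_to")
        return start_to


try:
    MyModel(start_from="2023-03-31", start_to="2023-03-15", other_data="XX")
except ValidationError as exc:
    print(exc.errors()[0]["msg"])
```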