Mattia Surricchio

Reputation: 1608

Pydantic perform full validation and don't stop at the first error

Is there a way in Pydantic to perform the full validation of my classes and return all the possible errors?

It seems that the standard behaviour blocks the validation at the first encountered error.

As an example:

from pydantic import BaseModel

class Salary(BaseModel):
    gross: int
    net: int
    tax: int

class Employee(BaseModel):
    name: str
    age: int
    salary: Salary

salary = Salary(gross="hello", net=1000, tax=10)
employee = Employee(name="Mattia", age="hello", salary=salary)

This code works fine and returns the validation error:

pydantic.error_wrappers.ValidationError: 1 validation error for Salary
gross
  value is not a valid integer (type=type_error.integer)

However, it is not catching the second validation error on the age field. In a real bugfix scenario, I would need to fix the first validation error, re-run everything again, and only at that point I would discover the second error on age.

Is there a way to perform the full validation in pydantic? So validate everything and return ALL the validation errors? (so basically, do not stop at the first error met)

Upvotes: 5

Views: 2260

Answers (2)

import pandas as pd
from pydantic import BaseModel, Field, ValidationError

class SampleModel(BaseModel):
    name: str = Field(..., min_length=3, max_length=50)
    age: int = Field(..., gt=0, le=100)
    email: str 

data = {
    'name': ['John', 'Alice', 'Bob'],
    'age': [120, 130, 30],
    'email': ['john@email', '[email protected]', None]
}
df = pd.DataFrame(data)

# Validate every row and collect the errors instead of stopping at the first bad row
def validate_dataframe(df):
    errors = []
    for index, row in df.iterrows():
        try:
            # Convert the row (a pandas Series) to a plain dict before unpacking
            SampleModel(**row.to_dict())
        except ValidationError as e:
            # e.errors() lists every failed field for this row
            errors.append({'index': index, 'errors': e.errors()})
    return errors


validation_errors = validate_dataframe(df)

print(validation_errors)

This seemed to work for validating each row and collecting all of the errors. Is there a better solution?

Upvotes: 0

Daniil Fajnberg

Reputation: 18388

What you are describing is not Pydantic-specific behavior. This is how exceptions in Python work. As soon as one is raised (and is not caught somewhere up the stack), execution stops.

Validation is triggered when you attempt to initialize a Salary instance. Failed validation raises the ValidationError. The Python interpreter doesn't even begin executing the line where you want to initialize an Employee.

Pydantic is actually way nicer in this regard than it could be. If you pass more than one invalid value in the same initialization, the ValidationError will contain info about all of them. Like this example:

...
salary = Salary(gross="hello", net="foo", tax=10)

The error message will look like this:

ValidationError: 2 validation errors for Salary
gross
  value is not a valid integer (type=type_error.integer)
net
  value is not a valid integer (type=type_error.integer)

What you'll have to do, if you want to postpone raising errors, is wrap the initialization in a try-block and upon catching an error, you could for example add it to a list to be processed later.
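
A minimal sketch of that try-block approach, reusing the models from the question. The `collect_errors` helper and the `collected` list are hypothetical names introduced here for illustration; they are not part of Pydantic:

```python
from pydantic import BaseModel, ValidationError


class Salary(BaseModel):
    gross: int
    net: int
    tax: int


class Employee(BaseModel):
    name: str
    age: int
    salary: Salary


def collect_errors(candidates):
    """Hypothetical helper: try each (model, kwargs) pair and keep every error."""
    collected = []
    for model, kwargs in candidates:
        try:
            model(**kwargs)
        except ValidationError as exc:
            # exc.errors() returns a list of dicts, one per invalid field
            collected.append((model.__name__, exc.errors()))
    return collected


collected = collect_errors([
    (Salary, {"gross": "hello", "net": 1000, "tax": 10}),
    (Employee, {"name": "Mattia", "age": "hello",
                "salary": {"gross": 1, "net": 1, "tax": 1}}),
])

# Process all errors at the end instead of stopping at the first one
for model_name, errors in collected:
    for err in errors:
        print(model_name, err["loc"], err["msg"])
```

Each initialization is validated independently here, so a failure in one does not prevent the next one from being checked.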

In your example, this will not work because you want to use salary later on. In that case you could just initialize the Employee like this:

...
employee = Employee(
    name="Mattia",
    age="hello",
    salary={"gross": "hello", "net": 100, "tax": 10}
)

Which would also give you:

ValidationError: 2 validation errors for Employee
age
  value is not a valid integer (type=type_error.integer)
salary -> gross
  value is not a valid integer (type=type_error.integer)
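
If you want to handle these errors in code rather than read the message, you can catch the exception and inspect `errors()`. This is only a sketch with the same models as above; the `locations` variable is an illustrative name, and the exact message text differs between Pydantic versions, so only the field locations are used:

```python
from pydantic import BaseModel, ValidationError


class Salary(BaseModel):
    gross: int
    net: int
    tax: int


class Employee(BaseModel):
    name: str
    age: int
    salary: Salary


locations = []
try:
    Employee(
        name="Mattia",
        age="hello",
        salary={"gross": "hello", "net": 100, "tax": 10},
    )
except ValidationError as exc:
    # Each error dict has a "loc" tuple giving the path to the invalid field
    locations = [err["loc"] for err in exc.errors()]

for loc in locations:
    # Nested fields show up as a path, e.g. salary -> gross
    print(" -> ".join(str(part) for part in loc))
```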

Hope this helps.

Upvotes: 3
