Reputation: 25
I am writing code that loads data from a JSON file and parses it with Pydantic.
Here is the Python code:
import json
import pydantic
from typing import Optional, List


class Car(pydantic.BaseModel):
    manufacturer: str
    model: str
    date_of_manufacture: str
    date_of_sale: str
    number_plate: str
    price: float
    type_of_fuel: Optional[str]
    location_of_sale: Optional[str]


def load_data() -> None:
    with open("./data.json") as file:
        data = json.load(file)
    cars: List[Car] = [Car(**item) for item in data]
    print(cars[0])


if __name__ == "__main__":
    load_data()
And here is the JSON data:
[
    {
        "manufacturer": "BMW",
        "model": "i8",
        "date_of_manufacture": "14/06/2021",
        "date_of_sale": "19/11/2022",
        "number_plate": "ND21WHP",
        "price": "100,000",
        "type_of_fuel": "electric",
        "location_of_sale": "Leicester, England"
    },
    {
        "manufacturer": "Audi",
        "model": "TT RS",
        "date_of_manufacture": "22/02/2019",
        "date_of_sale": "12/08/2021",
        "number_plate": "LR69FOW",
        "price": "67,000",
        "type_of_fuel": "petrol",
        "location_of_sale": "Manchester, England"
    }
]
And this is the error I am getting:
File "pydantic\main.py", line 342, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for Car
price
  value is not a valid float (type=type_error.float)
I have tried adding .00 to the end of the price strings, but I get the same error.
Upvotes: 1
Views: 4196
Reputation: 18663
The problem comes from the fact that the default Pydantic validator for float simply tries to coerce the string value to float (as @Paul mentioned). And float("100,000") leads to a ValueError.
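To see the failure in isolation, without Pydantic involved, here is a minimal sketch that uses only the built-in float. The helper name try_float is made up for this illustration:

```python
def try_float(text: str):
    """Return float(text), or a description of the ValueError if conversion fails."""
    try:
        return float(text)
    except ValueError as exc:
        return f"ValueError: {exc}"


print(try_float("100000"))      # plain digits convert fine
print(try_float("100,000"))     # the comma makes float() raise ValueError
print(try_float("100,000.00"))  # appending .00 does not help, the comma is still there
```

This also explains why adding .00 to the price strings changed nothing: the comma is what float() rejects.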
I am surprised no one suggested this, but if you don't control the source JSON data, you can easily solve this issue by writing your own little validator to properly format the string (or parse the number properly yourself):
from typing import Optional

from pydantic import BaseModel, validator


class Car(BaseModel):
    manufacturer: str
    model: str
    date_of_manufacture: str
    date_of_sale: str
    number_plate: str
    price: float
    type_of_fuel: Optional[str]
    location_of_sale: Optional[str]

    # Runs before the built-in float validation because of pre=True
    @validator("price", pre=True)
    def adjust_number_format(cls, v: object) -> object:
        if isinstance(v, str):
            return v.replace(",", "")
        return v
The pre=True is important to make the adjustment before the default field validator receives the value. I purposefully did it like this to show that you don't need to convert the str to a float yourself, but you could of course do that too:
...
    @validator("price", pre=True)
    def parse_number(cls, v: object) -> object:
        if isinstance(v, str):
            return float(v.replace(",", ""))
        return v
Both of these work and require no changes in the JSON document.
Finally, if you have (or expect to have) multiple number-like fields and know that any of them may contain such oddly formatted strings, you could generalize that validator like this (using a different class for demo purposes):
from pydantic import BaseModel, validator
from pydantic.fields import ModelField


class Car2(BaseModel):
    model: str
    price: float
    year: int
    numbers: list[float]

    @validator("*", pre=True, each_item=True)
    def format_number_string(cls, v: object, field: ModelField) -> object:
        if issubclass(field.type_, (float, int)) and isinstance(v, str):
            return v.replace(",", "")
        return v


if __name__ == "__main__":
    car = Car2.parse_obj({
        "model": "foo",
        "price": "100,000",
        "year": "2,010",
        "numbers": ["1", "3.14", "10,000"]
    })
    print(car)  # model='foo' price=100000.0 year=2010 numbers=[1.0, 3.14, 10000.0]
Upvotes: 3
Reputation: 326
You could also change the comma , (used here as a thousands separator) to an underscore _ and keep the value as a string.
Pydantic then takes care of the str to float conversion.
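This works because Python's built-in float() accepts underscores between digits (Python 3.6 and later), assuming the field validation ends up calling float() on the string as described in the answer above. A small standard-library sketch; commas_to_underscores is a hypothetical helper name:

```python
def commas_to_underscores(text: str) -> str:
    """Swap thousands-separator commas for underscores, keeping a string."""
    return text.replace(",", "_")


price = commas_to_underscores("100,000")
print(price)         # 100_000
print(float(price))  # 100000.0
```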
Upvotes: 1
Reputation: 51
You need to remove the quotes (and the comma) around the numbers, since they are being interpreted as strings.
"price": "100,000"
should be:
"price": 100000
Upvotes: 0
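For the suggestion of fixing the data itself: if you control the JSON file, a one-off conversion script could do the rewrite rather than editing by hand. This is only a sketch under the assumption that price is the sole field needing a fix; normalize_prices is a hypothetical helper, and the short raw string stands in for the contents of data.json:

```python
import json


def normalize_prices(records: list) -> list:
    """Return copies of the records with string prices turned into numbers."""
    fixed = []
    for record in records:
        record = dict(record)  # shallow copy so the input is left untouched
        price = record.get("price")
        if isinstance(price, str):
            # Drop the thousands separator before converting
            record["price"] = float(price.replace(",", ""))
        fixed.append(record)
    return fixed


# Stand-in for the contents of data.json (assumption: trimmed to two fields)
raw = '[{"model": "i8", "price": "100,000"}, {"model": "TT RS", "price": "67,000"}]'
cleaned = normalize_prices(json.loads(raw))
print(json.dumps(cleaned, indent=4))
```

The cleaned records can then be written back with json.dump, after which the original Car model loads them without a custom validator.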