rangarajan

Reputation: 191

How to assign a value of different type to the field of a Pydantic model after validation?

I have a CSV file that contains YouTube URLs and their timestamps.

https://www.youtube.com/watch?v=dsnLcaNhXd6o,0:13-0:20;0:25-0:31;0:36-0:40
https://www.youtube.com/watch?v=d8InLcaNhXd6o,0:43-0:52;0:56-1:07
https://www.youtube.com/watch?v=Inji8LcaNhXd6o,0:13-0:20;0:25-0:31;0:36-0:40;0:43-0:52;0:56-1:07;1:15-1:25;1:28-1:40

I need to convert the CSV file into a Pydantic object so that I can validate its contents and pass the result on for further processing.

import csv

with open(csv_file, mode="r", newline="") as file:
    csvFile = csv.reader(file)
    csvList = list(enumerate(csvFile))

I have the following Pydantic models:

from typing import List

from pydantic import BaseModel

class TimeStamp(BaseModel):
    start_min: int
    start_sec: int
    end_min: int
    end_sec: int

class VideoDetail(BaseModel):
    row_index: int
    url: str
    timestamps: List[TimeStamp]

class VideoList(BaseModel):
    entry: List[VideoDetail]

Now I need to pass csvList to the VideoList model, perform some validations, and get a VideoList object back.

First, list(enumerate(csvFile)) returns a list of tuples, each holding the row index and the row.

Example:

csvList = list(enumerate(csvFile))
print(csvList)

Output:

[
(0, ['https://www.youtube.com/watch?v=dsnLcaNhXd6o', '0:13-0:20;0:25-0:31;0:36-0:40']),
(1, ['https://www.youtube.com/watch?v=d8InLcaNhXd6o', '0:43-0:52;0:56-1:07']),
(2, ['https://www.youtube.com/watch?v=Inji8LcaNhXd6o', '0:13-0:20;0:25-0:31;0:36-0:40;0:43-0:52;0:56-1:07;1:15-1:25;1:28-1:40'])
]

When I pass csvList to the VideoList model, the timestamps arrive as a single string. How can I turn that string into a list of TimeStamp objects?

I tried adding a validator to the timestamps field of the VideoDetail model that splits the string into a list of timestamps and returns it. It does not work: an error is raised because the type of the value does not match the type of the field.

Upvotes: 0

Views: 1303

Answers (1)

Aseem Ahir

Reputation: 743

The idea is to split the timestamp string into pieces and feed those pieces into the individual fields of the TimeStamp model.

I use a validator function to do this. Passing pre=True to validator makes the function run before Pydantic validates the field against its declared type, so it receives the raw string. The validator function does the following:

  1. The timestamp string (e.g. 0:43-0:52;0:56-1:07) is first split on ; to get a list of timestamp strings.
  2. It then loops through each timestamp string (e.g. 0:43-0:52) and splits it on - to get the start time and the end time.
  3. Lastly, it splits each start time and end time (e.g. 0:43) on :, converts each part to an integer, builds a TimeStamp from them, and appends it to the list.
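
The three steps above can also be sketched as a plain function with no Pydantic involved, which makes the splitting logic easy to try on its own. The name parse_timestamps and the choice to return dictionaries are assumptions made for this illustration, not part of the models in the question.

```python
from typing import Dict, List


def parse_timestamps(raw: str) -> List[Dict[str, int]]:
    """Turn a string like '0:43-0:52;0:56-1:07' into start/end dictionaries."""
    parsed = []
    # Step 1: split on ';' to get one string per time range.
    for time_range in raw.split(";"):
        # Step 2: split each range on '-' into its start and end time.
        start, end = time_range.split("-")
        # Step 3: split each time on ':' and convert both parts to integers.
        start_min, start_sec = (int(part) for part in start.split(":"))
        end_min, end_sec = (int(part) for part in end.split(":"))
        parsed.append(
            {
                "start_min": start_min,
                "start_sec": start_sec,
                "end_min": end_min,
                "end_sec": end_sec,
            }
        )
    return parsed


print(parse_timestamps("0:43-0:52;0:56-1:07"))
# [{'start_min': 0, 'start_sec': 43, 'end_min': 0, 'end_sec': 52},
#  {'start_min': 0, 'start_sec': 56, 'end_min': 1, 'end_sec': 7}]
```

The keys match the field names of TimeStamp, so each dictionary could be passed to the model as TimeStamp(**item). A malformed range (for example one without a -) raises ValueError here, which is probably what you want inside a validator as well.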

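(I have used dictionaries instead of tuples for the rows. You can start from tuples as long as you convert them to this form.)

from typing import List

from pydantic import BaseModel, validator

class TimeStamp(BaseModel):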
    start_min: int
    start_sec: int
    end_min: int
    end_sec: int

class VideoDetail(BaseModel):
    row_index: int
    url: str
    timestamps: List[TimeStamp]

    @validator("timestamps", pre=True)
    def createTimestamps(cls, value):
        # Anything that is not a raw string is left for Pydantic to validate as-is.
        if not isinstance(value, str):
            return value
        timestampslist = []
        # "0:43-0:52;0:56-1:07" -> "0:43-0:52", "0:56-1:07"
        for eachTimestamp in value.split(";"):
            start_time_str, end_time_str = eachTimestamp.split("-")
            start_min, start_sec = start_time_str.split(":")
            end_min, end_sec = end_time_str.split(":")
            t = TimeStamp(start_min=int(start_min),
                          start_sec=int(start_sec),
                          end_min=int(end_min),
                          end_sec=int(end_sec))
            timestampslist.append(t)
        return timestampslist

class VideoList(BaseModel):
    entry: List[VideoDetail]

csvList = [
    {"row_index": 0, "url": "https://www.youtube.com/watch?v=dsnLcaNhXd6o", "timestamps": "0:13-0:20;0:25-0:31;0:36-0:40"},
    {"row_index": 1, "url": "https://www.youtube.com/watch?v=wcsnLcad6d", "timestamps": "0:13-0:20;0:25-0:31;0:36-0:40"},
    {"row_index": 2, "url": "https://www.youtube.com/watch?v=LcdshXe6o", "timestamps": "0:13-0:20;0:25-0:31;0:36-0:40"},
]

vs = VideoList(entry=csvList)
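
To connect this with the CSV reading in the question, the (index, row) pairs from enumerate need to become dictionaries keyed by the VideoDetail field names. Here is a minimal sketch using only the standard library; the helper name rows_to_entries is made up for this example, io.StringIO stands in for the real file, and it assumes every row has exactly two columns.

```python
import csv
import io

# Stand-in for the CSV file from the question (its first two rows).
csv_text = (
    "https://www.youtube.com/watch?v=dsnLcaNhXd6o,0:13-0:20;0:25-0:31;0:36-0:40\n"
    "https://www.youtube.com/watch?v=d8InLcaNhXd6o,0:43-0:52;0:56-1:07\n"
)


def rows_to_entries(lines):
    """Build one dictionary per CSV row, keyed by the VideoDetail field names."""
    reader = csv.reader(lines)
    return [
        {"row_index": index, "url": url, "timestamps": timestamps}
        # Unpacking assumes exactly two columns: the URL and the timestamp string.
        for index, (url, timestamps) in enumerate(reader)
    ]


entries = rows_to_entries(io.StringIO(csv_text))
print(entries[1])
```

With a real file you would pass the open file object instead of io.StringIO(csv_text), and then build the model with VideoList(entry=entries).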

Upvotes: 0
