CopyOfA

Reputation: 851

PyTorch DataLoader from a CSV of file paths and labels

I have a CSV file for my train and test datasets that contains each file's location and its label. The head of this data frame is:

df.head()
Out[46]: 
             file_path  label
0  \\images\\29771.png      0
1  \\images\\55201.png      0
2  \\images\\00715.png      1
3  \\images\\33214.png      0
4  \\images\\99841.png      1

The files live in multiple locations and I have limited disk space, so I can't copy them into per-class \0 and \1 folders. How can I use this data frame to create a PyTorch Dataset and/or DataLoader object?

Upvotes: 2

Views: 2550

Answers (1)

Karl

Reputation: 5383

Just write a custom Dataset subclass whose __getitem__ method looks up the file path and label for a given row.

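from torch.utils.data import Dataset


class MyData(Dataset):
    def __init__(self, df):
        # Reset the index so lookups still work after filtering or splitting
        self.df = df.reset_index(drop=True)

    def __len__(self):
        return len(self.df)

    def __getitem__(self, index):
        # iloc is positional, so it does not depend on the index labels
        row = self.df.iloc[index]
        image = load_image(row["file_path"])
        label = row["label"]

        return image, label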

Here load_image is a function you supply that reads the file at the given path into whatever format you need (for example, a tensor).
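
As a rough end-to-end sketch of how the pieces could fit together, the block below fills in one possible load_image and wraps the dataset in a DataLoader. It assumes Pillow and NumPy are installed and that the CSV has file_path and label columns as in the question. The load_image body and the make_loader helper are hypothetical illustrations, not part of the original answer.

```python
import numpy as np
import pandas as pd
import torch
from PIL import Image
from torch.utils.data import DataLoader, Dataset


def load_image(path):
    """Hypothetical loader: read an image file into a float tensor of shape (C, H, W)."""
    with Image.open(path) as img:
        # Force three channels and scale pixel values to [0, 1]
        array = np.asarray(img.convert("RGB"), dtype=np.float32) / 255.0
    # PIL gives (H, W, C); PyTorch convolution layers expect channels first
    return torch.from_numpy(array).permute(2, 0, 1)


class MyData(Dataset):
    def __init__(self, df):
        # Positional lookups below assume a clean 0..n-1 index
        self.df = df.reset_index(drop=True)

    def __len__(self):
        return len(self.df)

    def __getitem__(self, index):
        row = self.df.iloc[index]
        # Cast to a plain int so the default collate function builds an integer tensor
        return load_image(row["file_path"]), int(row["label"])


def make_loader(csv_path, batch_size=32, shuffle=True):
    """Hypothetical helper: build a DataLoader from a CSV of file paths and labels."""
    df = pd.read_csv(csv_path)
    return DataLoader(MyData(df), batch_size=batch_size, shuffle=shuffle)
```

With this in place, something like make_loader("train.csv") (the file name is a placeholder) yields batches of (images, labels). Note that the default batching stacks tensors, so it only works if every image has the same size; if yours differ, resize inside load_image before returning.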

Upvotes: 5
