Reputation: 851
I have a CSV file for my train and test datasets that contains each image's file location and its label. The head of the data frame is:
df.head()
Out[46]:
             file_path  label
0  \\images\\29771.png      0
1  \\images\\55201.png      0
2  \\images\\00715.png      1
3  \\images\\33214.png      0
4  \\images\\99841.png      1
The files are spread across multiple locations and I have limited disk space, so I can't copy them into \0 and \1 class folders. How can I use this data frame to create a PyTorch Dataset and/or DataLoader object?
Upvotes: 2
Views: 2550
Reputation: 5383
Just write a custom __getitem__ method for your dataset:
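from torch.utils.data import Dataset

class MyData(Dataset):
    def __init__(self, df):
        self.df = df

    def __len__(self):
        return self.df.shape[0]

    def __getitem__(self, index):
        # .iloc selects by position, so this still works if df has a non-default index
        image = load_image(self.df.file_path.iloc[index])
        label = self.df.label.iloc[index]
        return image, label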
Here load_image is a function that reads the file at the given path into whatever format you need.
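To show how this plugs into a DataLoader, here is a minimal sketch rather than a definitive implementation. The names CsvImageDataset and dummy_loader are made up for illustration, the loader is passed in as an argument instead of being a global function, and dummy_loader just returns a zero tensor so the snippet runs without any image files on disk. In practice you would presumably swap it for something that really decodes the image (for example PIL's Image.open followed by a transform to a tensor).

```python
import pandas as pd
import torch
from torch.utils.data import Dataset, DataLoader


class CsvImageDataset(Dataset):
    """Map-style dataset backed by a DataFrame with file_path and label columns."""

    def __init__(self, df, loader):
        # Reset the index so positions 0..len-1 are valid even if df was filtered
        self.df = df.reset_index(drop=True)
        self.loader = loader  # callable: file path -> tensor

    def __len__(self):
        return len(self.df)

    def __getitem__(self, index):
        row = self.df.iloc[index]
        image = self.loader(row["file_path"])
        label = int(row["label"])
        return image, label


def dummy_loader(path):
    # Placeholder (assumption): stands in for real image decoding
    return torch.zeros(3, 8, 8)


# Same layout as the data frame in the question
df = pd.DataFrame({
    "file_path": [
        r"\\images\\29771.png",
        r"\\images\\55201.png",
        r"\\images\\00715.png",
        r"\\images\\33214.png",
        r"\\images\\99841.png",
    ],
    "label": [0, 0, 1, 0, 1],
})

dataset = CsvImageDataset(df, dummy_loader)
loader = DataLoader(dataset, batch_size=2, shuffle=False)

# Each batch is a stacked image tensor plus a tensor of labels
images, labels = next(iter(loader))
```

Since the paths are read straight from the data frame, the images can stay wherever they are; nothing has to be copied into per-class folders.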
Upvotes: 5