rts

Reputation: 31

Load csv and Image dataset in pytorch

I am doing image classification with PyTorch. I have an images folder and separate train and test csv files containing image IDs and labels. I don't know how to combine the images with their IDs and labels and convert them into tensors.

  1. train.csv: contains the image IDs (4325.jpg, 2345.jpg, and so on) and labels such as cat and dog.
  2. Image_data: contains all the images, each named by its ID.

Upvotes: 3

Views: 8386

Answers (1)

Mitiku

Reputation: 5412

You can create a custom dataset class by inheriting from PyTorch's torch.utils.data.Dataset.

The following custom dataset class assumes that:

  • The csv file has a header row and two comma-separated columns, filename and label:

filename,label
4325.jpg,cat
2345.jpg,dog
  • All images are inside the images folder.
import os

import pandas as pd
import PIL.Image
import torch


class CustomDataset(torch.utils.data.Dataset):
    def __init__(self, csv_path, images_folder, transform=None):
        self.df = pd.read_csv(csv_path)
        self.images_folder = images_folder
        self.transform = transform
        self.class2index = {"cat": 0, "dog": 1}

    def __len__(self):
        return len(self.df)

    def __getitem__(self, index):
        # Select the row by position; column names match the csv header
        row = self.df.iloc[index]
        filename = row["filename"]
        label = self.class2index[row["label"]]
        image = PIL.Image.open(os.path.join(self.images_folder, filename))
        if self.transform is not None:
            image = self.transform(image)
        return image, label
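
The hard-coded class2index mapping is fine for two classes. With more labels, one option is to derive it from the label column instead. The sketch below is only an illustration: build_class2index is a hypothetical helper name, and the plain list stands in for the label column of the csv.

```python
def build_class2index(labels):
    """Map each distinct label to an integer index, in sorted label order."""
    return {name: i for i, name in enumerate(sorted(set(labels)))}


# Stand-in for the label column of train.csv (illustrative values)
labels = ["cat", "dog", "dog", "cat"]
class2index = build_class2index(labels)
print(class2index)
```

If you go this route, build the mapping once from the training labels and reuse it for the test dataset so that both agree on the indices.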
        

Now you can use this class to load the training and test datasets from a csv file and the images folder.


train_dataset = CustomDataset("path - to - train.csv", "path - to - images - folder")
test_dataset = CustomDataset("path - to - test.csv", "path - to - images - folder")


image, label = train_dataset[0]

Upvotes: 8
