What is different between DataLoader and DataLoader2 in PyTorch?

Question

I wrote a custom dataset by subclassing PyTorch's Dataset class. The code looks like this:

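import os

import numpy as np
import torch
from PIL import Image


class CustomDataset(torch.utils.data.Dataset):

    def __init__(self, root_path, transform=None, mean=None, std=None):
        self.path = root_path
        # Optional statistics; not used in the code shown here
        self.mean = mean
        self.std = std
        self.transform = transform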
        self.images = []
        self.masks = []

        for add in os.listdir(self.path):
            # Some script to load file from directory and appending address to relative array
            ...

        self.masks.sort()
        self.images.sort()

    def __len__(self):
        return len(self.images)

    def __getitem__(self, item):
        image_address = self.images[item]
        mask_address = self.masks[item]

        # Load both files as NumPy arrays
        image = np.asarray(Image.open(image_address, 'r', None))
        mask = np.asarray(Image.open(mask_address, 'r', None))

        if self.transform is not None:
            # The transform takes keyword arguments and returns a dict
            augment = self.transform(image=image, mask=mask)
            image = Image.fromarray(augment['image'])
            mask = augment['mask']

        # Handle Augmentation here

        return image, mask

Then I created an object of this class and passed it to torch.utils.data.DataLoader. This works well with DataLoader, but with torch.utils.data.DataLoader2 I ran into a problem. This is the call and the error it raises:

dataloader = torch.utils.data.DataLoader2(dataset=dataset, batch_size=2, pin_memory=True, num_workers=4)

Exception: thread parallelism mode is not supported for old DataSets

My question is: why was DataLoader2 added to PyTorch, how does it differ from DataLoader, and what are its benefits?

PyTorch Version: 1.10.1

Ivan · Accepted Answer

You should definitely not use DataLoader2.

torch.utils.data.DataLoader2 (actually torch.utils.data.dataloader_experimental.DataLoader2) was added as an experimental "feature" as a future replacement for DataLoader. It is defined here. Currently, it is only accessible on the master branch (unstable) and is of course not documented on the official pages.
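In practice this means staying with the stable torch.utils.data.DataLoader, which accepts ordinary map-style datasets like the one in the question. Here is a minimal sketch of that, assuming a toy stand-in dataset: ToySegmentationDataset and its random tensors are hypothetical and only replace the file loading of CustomDataset so the snippet runs on its own.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ToySegmentationDataset(Dataset):
    """Hypothetical stand-in for CustomDataset: random image/mask pairs."""

    def __init__(self, length=6):
        self.length = length

    def __len__(self):
        return self.length

    def __getitem__(self, item):
        image = torch.rand(3, 8, 8)          # fake RGB image, channels first
        mask = torch.randint(0, 2, (8, 8))   # fake binary mask
        return image, mask


dataset = ToySegmentationDataset()

# The stable DataLoader works with map-style datasets.
# num_workers=0 keeps the sketch runnable anywhere; raise it for real workloads.
dataloader = DataLoader(dataset, batch_size=2, shuffle=False, num_workers=0)

# Pull one batch; the default collate function stacks samples along dim 0.
images, masks = next(iter(dataloader))
batch_shapes = (tuple(images.shape), tuple(masks.shape))
```

The other arguments from the question (pin_memory=True, num_workers=4) are regular DataLoader parameters and can be passed the same way.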
