Reputation: 145
I built a custom dataset by subclassing the PyTorch Dataset class. The code looks like this:
import os

import numpy as np
import torch
from PIL import Image


class CustomDataset(torch.utils.data.Dataset):
    def __init__(self, root_path, transform=None, mean=None, std=None):
        self.path = root_path
        # Optional normalization statistics
        self.mean = mean
        self.std = std
        self.transform = transform
        self.images = []
        self.masks = []
        for add in os.listdir(self.path):
            # Some script that loads each file from the directory and appends its address to the matching list
            ...
        # Sort both lists so that images[i] and masks[i] refer to the same sample
        self.masks.sort()
        self.images.sort()

    def __len__(self):
        return len(self.images)

    def __getitem__(self, item):
        image_address = self.images[item]
        mask_address = self.masks[item]
        if self.transform is not None:
            # Apply the same augmentation to the image and its mask
            augment = self.transform(image=np.asarray(Image.open(image_address, 'r', None)),
                                     mask=np.asarray(Image.open(mask_address, 'r', None)))
            image = Image.fromarray(augment['image'])
            mask = augment['mask']
        else:
            image = np.asarray(Image.open(image_address, 'r', None))
            mask = np.asarray(Image.open(mask_address, 'r', None))
        # Handle Augmentation here
        return image, mask
Then I created an object from this class and passed it to torch.utils.data.DataLoader. This works well with DataLoader, but with torch.utils.data.DataLoader2 I ran into a problem. The call and the error are:
dataloader = torch.utils.data.DataLoader2(dataset=dataset, batch_size=2, pin_memory=True, num_workers=4)
Exception: thread parallelism mode is not supported for old DataSets
My question is: why was the DataLoader2 module added to PyTorch, how does it differ from DataLoader, and what are its benefits?
PyTorch Version: 1.10.1
Upvotes: 6
Views: 3244
Reputation: 40628
You should definitely not use DataLoader2.
torch.utils.data.DataLoader2 (actually torch.utils.data.dataloader_experimental.DataLoader2) was added as an experimental "feature" intended as a future replacement for DataLoader. It is defined in the dataloader_experimental module of the PyTorch source. Currently, it is only accessible on the master branch (unstable) and is, of course, not documented on the official pages.
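For reference, here is a minimal sketch of sticking with the stable torch.utils.data.DataLoader for a map-style dataset like the one in the question. ToyPairDataset and first_batch_shapes are hypothetical names, and the synthetic tensors only stand in for the image and mask files, so treat this as an illustration rather than a drop-in replacement:

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyPairDataset(Dataset):
    """Hypothetical stand-in for CustomDataset: yields (image, mask) tensor pairs."""

    def __init__(self, n_items=4):
        # Synthetic tensors replace the image and mask files read from disk.
        self.images = [torch.full((3, 8, 8), float(i)) for i in range(n_items)]
        self.masks = [torch.full((8, 8), i, dtype=torch.long) for i in range(n_items)]

    def __len__(self):
        return len(self.images)

    def __getitem__(self, item):
        return self.images[item], self.masks[item]


def first_batch_shapes(batch_size=2):
    """Build a stable DataLoader and return the shapes of its first batch."""
    # num_workers=0 keeps the sketch single-process; the question's
    # num_workers=4 and pin_memory=True are also accepted by DataLoader.
    loader = DataLoader(ToyPairDataset(), batch_size=batch_size,
                        shuffle=False, num_workers=0)
    images, masks = next(iter(loader))
    return tuple(images.shape), tuple(masks.shape)


if __name__ == "__main__":
    print(first_batch_shapes())
```

The default collate function stacks the per-sample tensors along a new leading batch dimension, which is why a map-style dataset that defines only __len__ and __getitem__ is enough for DataLoader.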
Upvotes: 3