Ryan

Reputation: 10109

RuntimeError: inconsistent tensor sizes at /pytorch/torch/lib/TH/generic/THTensorMath.c:2864

I'm trying to build a dataloader. This is what it looks like:

```python
import os

import numpy as np
import pandas as pd
import torch
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms


class WhaleData(Dataset):
    def __init__(self, data_file, root_dir, transform=None):
        self.csv_file = pd.read_csv(data_file)
        self.root_dir = root_dir
        self.transform = transforms.Resize(224)

    def __len__(self):
        return len(os.listdir(self.root_dir))

    def __getitem__(self, index):
        image = os.path.join(self.root_dir, self.csv_file['Image'][index])
        image = Image.open(image)
        image = self.transform(image)
        image = np.array(image)
        label = self.csv_file['Image'][index]
        sample = {'image': image, 'label': label}
        return sample


trainset = WhaleData(data_file='/mnt/55-91e8-b2383e89165f/Ryan/1234/train.csv',
                     root_dir='/mnt/4d55-91e8-b2383e89165f/Ryan/1234/train')
train_loader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                           shuffle=True, num_workers=2)
for i, batch in enumerate(train_loader):
    print(i, batch)
```

When I run this block of code, I get the error above. I do understand the nature of the error: my images are not all of the same shape. But if I'm not wrong, the error should only arise when I feed them to the network, since that is where same-shaped tensors are required. So why is it thrown here? Any suggestions on where I might have gone wrong would be extremely helpful, and I'd be happy to provide any extra information if needed.

Thanks

RuntimeError: Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 42, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 116, in default_collate
    return {key: default_collate([d[key] for d in batch]) for key in batch[0]}
  File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 116, in <dictcomp>
    return {key: default_collate([d[key] for d in batch]) for key in batch[0]}
  File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 105, in default_collate
    return torch.stack([torch.from_numpy(b) for b in batch], 0)
  File "/usr/local/lib/python3.5/dist-packages/torch/functional.py", line 64, in stack
    return torch.cat(inputs, dim)
RuntimeError: inconsistent tensor sizes at /pytorch/torch/lib/TH/generic/THTensorMath.c:2864

Upvotes: 0

Views: 1185

Answers (1)

benjaminplanche

Reputation: 15119

The error appears when PyTorch tries to stack the images into a single batch tensor (cf. `torch.stack([torch.from_numpy(b) for b in batch], 0)` in your trace). As you mentioned, since the images have different shapes, the stacking fails: a tensor of shape (B, H, W) can only be created by stacking B tensors if all of those tensors have shape (H, W).
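You can reproduce the failure in isolation. Below is a minimal sketch (the two array shapes are made-up examples) showing that `torch.stack` refuses tensors of unequal size, which is exactly what `default_collate` calls under the hood for a batch of NumPy images:

```python
import numpy as np
import torch

# Two "images" whose shorter side is 224 but whose widths differ,
# which is what transforms.Resize(224) produces: it fixes only the
# shorter side and preserves the aspect ratio.
a = np.zeros((224, 300, 3), dtype=np.uint8)
b = np.zeros((224, 350, 3), dtype=np.uint8)

try:
    torch.stack([torch.from_numpy(a), torch.from_numpy(b)], 0)
except RuntimeError as e:
    print("stack failed:", e)
```

So the error fires inside the DataLoader, before anything reaches your network, because batching itself already requires a uniform shape.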


Note: I'm not fully sure, but setting batch_size=1 for torch.utils.data.DataLoader(...) may remove this particular error, as it probably won't need to call torch.stack() anymore.

Upvotes: 1
