pytorch custom dataset: DataLoader returns a list of tensors rather than tensor of a list

Question

import torch

class Custom_Dataset(torch.utils.data.dataset.Dataset):
    def __init__(self, _dataset):
        self.dataset = _dataset

    def __getitem__(self, index):
        example, target = self.dataset[index]
        return example, target

    def __len__(self):
        return len(self.dataset)

train_data = [([1, 3, 5], 0),
              ([2, 4, 6], 1)]
train_loader = torch.utils.data.DataLoader(dataset=Custom_Dataset(train_data),
                                           batch_size=1,
                                           shuffle=False)

for inputs, targets in train_loader:
    print(inputs)
    print(targets)

I define my training data as [([1, 3, 5], 0), ([2, 4, 6], 1)], where each input (e.g. [1, 3, 5]) is paired with a target (e.g. 0).

But when I fetch a batch from the data loader, I get:

[tensor([1]), tensor([3]), tensor([5])]
tensor([0])

How can I get the following instead?

tensor([[1],
        [3],
        [5]])
tensor([0])

I know torch.stack can do the trick, but can I convert it in my custom dataset class?

Bedir Yilmaz · Accepted Answer

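The default collate function treats a Python list as a sequence and batches it element by element, which is why you get one tensor per position. One way to get the desired input is to return a NumPy array from __getitem__, so that the samples are stacked along a new batch dimension instead. Below I changed only two lines in your example to make it work.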

import torch
import numpy as np

class Custom_Dataset(torch.utils.data.dataset.Dataset):
    def __init__(self, _dataset):
        self.dataset = _dataset

    def __getitem__(self, index):
        example, target = self.dataset[index]
        return np.array(example), target

    def __len__(self):
        return len(self.dataset)

train_data = [([1, 3, 5], 0),
              ([2, 4, 6], 1)]
train_loader = torch.utils.data.DataLoader(dataset=Custom_Dataset(train_data),
                                           batch_size=1,
                                           shuffle=False)

for inputs, targets in train_loader:
    print(inputs)
    print(targets)

The output of this code is:

tensor([[1, 3, 5]])
tensor([0])

tensor([[2, 4, 6]])
tensor([1])
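
To see where the difference comes from, it can help to call the default collate function directly on a one-sample batch. This is only a small sketch: the import path below is the one I believe works across PyTorch versions, and the variable names are made up for illustration.

```python
import numpy as np
from torch.utils.data.dataloader import default_collate

# A "batch" is a list of samples; each sample is (example, target).
list_batch = [([1, 3, 5], 0)]             # example is a Python list
array_batch = [(np.array([1, 3, 5]), 0)]  # example is a NumPy array

list_inputs, list_targets = default_collate(list_batch)
array_inputs, array_targets = default_collate(array_batch)

# A list example is collated position by position:
# the result is a list of three tensors, each of shape (1,).
print(type(list_inputs), len(list_inputs))

# An array example is converted to a tensor and stacked:
# the result is a single tensor of shape (batch_size, 3).
print(array_inputs.shape)
```

The same reasoning applies if __getitem__ returns torch.tensor(example) instead of a NumPy array.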

Of course, I am assuming that it makes no difference to you whether the input is a row vector or a column vector. Otherwise, you might want to check this answer about transposing 1D vectors.
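
If the column layout from the question does matter, one possible variant is to build the column inside the dataset class itself. The sketch below is an assumption on my part rather than the only way to do it: ColumnDataset is a hypothetical name, and it uses torch.tensor with unsqueeze(1) in place of np.array.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class ColumnDataset(Dataset):  # hypothetical name, for illustration only
    def __init__(self, samples):
        self.samples = samples

    def __getitem__(self, index):
        example, target = self.samples[index]
        # Shape (3,) -> (3, 1), so each sample is already a column vector.
        return torch.tensor(example).unsqueeze(1), target

    def __len__(self):
        return len(self.samples)

train_data = [([1, 3, 5], 0),
              ([2, 4, 6], 1)]
loader = DataLoader(ColumnDataset(train_data), batch_size=1, shuffle=False)

inputs, targets = next(iter(loader))
print(inputs.shape)       # batch dimension first: (1, 3, 1)
print(inputs.squeeze(0))  # drop the batch dimension to get the (3, 1) column
print(targets)
```

Note that the DataLoader always adds a leading batch dimension, so with batch_size=1 the column only appears after squeeze(0).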

Hope this helps.
