Alfred

Reputation: 563

How to create a completely (uniformly) random dataset on PyTorch

I need to run some experiments on custom datasets using PyTorch. The question is: how can I create a dataset that can be fed to torch.utils.data.DataLoader?

I have two lists: one is called values and has a datapoint tensor at every entry, and the other is called labels, which holds the corresponding label for each datapoint. What I did is the following:

for i in range(samples):
    dataset[i] = [values[i], labels[i]]

So I have a list with datapoint and respective label, and then tried the following:

dataset = torch.tensor(dataset).float()
dataset = torch.utils.data.TensorDataset(dataset)

data_loader = torch.utils.data.DataLoader(dataset=dataset, batch_size=100, shuffle=True, num_workers=4, pin_memory=True)

But, first of all, I get the error "Not a sequence" from the torch.tensor call, and second, I'm not sure this is the right way of creating a dataset. Any suggestions?

Thank you very much!

Upvotes: 0

Views: 3494

Answers (2)

George yang

Reputation: 51

Just to enrich the answer by @Shai:

import numpy as np
from torch.utils.data import Dataset

class MyDataset(Dataset):
    def __init__(self, values):
        super(MyDataset, self).__init__()
        self.values = values

    def __len__(self):
        return len(self.values)

    def __getitem__(self, index):
        return self.values[index]

# 51000 uniformly random samples with 3 features each
values = np.random.rand(51000, 3)

dataset = MyDataset(values)
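
A minimal sketch of how this dataset of uniformly random values could then be wrapped in a DataLoader (the batch size here is just illustrative, taken from the question):

from torch.utils.data import DataLoader

loader = DataLoader(dataset, batch_size=100, shuffle=True)

for batch in loader:
    # the numpy rows are collated into a float64 tensor of shape (100, 3)
    print(batch.shape)
    break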

Upvotes: 1

Shai

Reputation: 114866

You do not need to overload DataLoader, but rather create a Dataset for your data.
For instance,

from torch.utils.data import Dataset

class MyDataset(Dataset):
  def __init__(self, values, labels):
    super(MyDataset, self).__init__()
    # store the pre-computed datapoints and their labels
    self.values = values
    self.labels = labels

  def __len__(self):
    return len(self.values)  # number of samples in the dataset

  def __getitem__(self, index):
    return self.values[index], self.labels[index]
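
With the values and labels lists from the question, a usage sketch might look like this (the DataLoader arguments are just the ones from the question, and it is assumed every datapoint tensor has the same shape so the default collate function can stack them):

from torch.utils.data import DataLoader

dataset = MyDataset(values, labels)  # the two lists from the question
data_loader = DataLoader(dataset, batch_size=100, shuffle=True,
                         num_workers=4, pin_memory=True)

for batch_values, batch_labels in data_loader:
    # batch_values is a stacked tensor of 100 datapoints,
    # batch_labels the matching 100 labels
    ...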

Upvotes: 3
