santoku

Reputation: 3437

How can a PyTorch DataLoader take in two ndarrays (data and labels)?

My training features are an ndarray of shape (100, 400, 3), i.e. 100 flattened 20x20 RGB images, and my labels have shape (100,). Do I need to combine them into a single dataset, or how else can I pass them to a PyTorch DataLoader so that I can iterate over images and labels later?

What I've tried so far

# Turn the ndarrays of features and labels into tensors
from torchvision import transforms

transform = transforms.Compose([transforms.ToPILImage(),
                                transforms.ToTensor()])

Upvotes: 2

Views: 1316

Answers (2)

zihaozhihao

Reputation: 4485

As @Shai mentioned, DataLoader expects its input to be an instance of Dataset or of one of its subclasses. One of the simplest subclasses is TensorDataset, which you can build directly from tensors converted from your ndarrays.

import numpy as np
import torch
from torch.utils.data import TensorDataset, DataLoader

# Random stand-in data: float32 features and int64 class labels
train_x = torch.from_numpy(np.random.randn(100, 400, 3)).float()
train_y = torch.from_numpy(np.random.randint(0, 2, 100)).long()

dataset = TensorDataset(train_x, train_y)
dataloader = DataLoader(dataset)  # default batch_size is 1
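
If TensorDataset is not flexible enough, the "Dataset or its subclass" requirement can also be met with a small custom class. The following is only a minimal sketch: the class name ArrayPairDataset and the variable names are hypothetical, and the random arrays merely stand in for the shapes given in the question.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader


class ArrayPairDataset(Dataset):
    """Hypothetical helper that pairs a feature ndarray with a label ndarray."""

    def __init__(self, features, labels):
        if len(features) != len(labels):
            raise ValueError("features and labels must have the same length")
        # float32 features and int64 labels are what most classification losses expect
        self.features = torch.from_numpy(features).float()
        self.labels = torch.from_numpy(labels).long()

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, index):
        # One example is a (feature, label) pair
        return self.features[index], self.labels[index]


# Random stand-in data with the shapes from the question
features = np.random.randn(100, 400, 3)
labels = np.random.randint(0, 2, 100)

pair_dataset = ArrayPairDataset(features, labels)
pair_loader = DataLoader(pair_dataset, batch_size=10, shuffle=True)

# Each iteration yields one batch of features and the matching labels
first_x, first_y = next(iter(pair_loader))
```

A custom class like this mainly pays off when each example needs its own processing in __getitem__ (for instance a transform); for plain arrays, TensorDataset does the same job with less code.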

Upvotes: 2

Shai

Reputation: 114926

You can convert your data and label ndarrays to torch.Tensor objects and use torch.utils.data.TensorDataset to create a dataset that yields (data, label) pairs.
Once you have a dataset, you can wrap a DataLoader around it to batch and shuffle the examples for training.
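
The two steps above might look roughly like this. It is a sketch under assumptions rather than a definitive recipe: the random arrays stand in for the real data, the batch size of 16 is arbitrary, and the reshape assumes each row of 400 values is a 20x20 image flattened row by row.

```python
import numpy as np
import torch
from torch.utils.data import TensorDataset, DataLoader

# Stand-in arrays with the shapes from the question
features = np.random.rand(100, 400, 3).astype(np.float32)
labels = np.random.randint(0, 2, size=100)

# Assumption: each row of 400 values is a 20x20 image flattened row by row
images = torch.from_numpy(features).reshape(100, 20, 20, 3)
# Move channels first, (N, C, H, W), the layout PyTorch convolution layers expect
images = images.permute(0, 3, 1, 2).contiguous()
targets = torch.from_numpy(labels).long()

dataset = TensorDataset(images, targets)
loader = DataLoader(dataset, batch_size=16, shuffle=True)

# Iterate over (image batch, label batch) pairs, as a training loop would
batch_shapes = []
for batch_images, batch_targets in loader:
    batch_shapes.append((tuple(batch_images.shape), tuple(batch_targets.shape)))
```

If the model consumes the flattened (400, 3) features directly, the reshape and permute lines can simply be dropped.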

Upvotes: 2
