Monica Heddneck

Reputation: 3115

DataLoader not randomly sampling in PyTorch

My DataLoader is returning the same image every epoch. My model only ever sees the single image at index 0 each time (batch size is 1, although nothing changes with different batch sizes anyway).

Here's my dataset, stripped down to the important bits:

class MyDataset(Dataset):

    def __init__(self, path, loader=pil_loader):
        self.path = path
        self.images = os.listdir(path)

    def __getitem__(self, index):
        image = self.images[index]

    . . .

And here's the dataset instance:

train_ds = MyDataset('/data')

And here's my sampler:

train_sampler = RandomSampler(train_ds)

And here's my DataLoader:

train_dl = DataLoader(train_ds, batch_size=1, sampler=train_sampler)

I'm not sure why it returns the same image each time during training.

Did I set up RandomSampler incompletely? Or did I write __getitem__ incorrectly? I can't figure it out.

Upvotes: 3

Views: 3747

Answers (1)

Monica Heddneck

Reputation: 3115

Aha. Well, if anyone ends up here with the same issue, I figured out what it was, and maybe this will help.

My definition of __len__ was wrong.

The random sampler depends on how you've defined __len__: it draws indices from range(len(dataset)), so a broken length method breaks the sampling.

Mine was temporarily mocked up as

def __len__(self):
    return len(0)

instead of something real, like:

def __len__(self):
    return len(self.images)
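To convince yourself the fix works, here's a minimal, self-contained sketch (the `ToyDataset` class and the size of 10 are made up for illustration; items are plain ints instead of images) showing that once `__len__` is correct, RandomSampler visits every index in a shuffled order:

```python
import torch
from torch.utils.data import Dataset, DataLoader, RandomSampler

# Hypothetical stand-in for MyDataset: items are just ints, no image loading.
class ToyDataset(Dataset):
    def __init__(self, n):
        self.items = list(range(n))

    def __len__(self):
        # RandomSampler draws indices from range(len(dataset)),
        # so this must report the real dataset size.
        return len(self.items)

    def __getitem__(self, index):
        return self.items[index]

ds = ToyDataset(10)
train_sampler = RandomSampler(ds)
train_dl = DataLoader(ds, batch_size=1, sampler=train_sampler)

# One epoch visits every index exactly once, in shuffled order.
seen = [batch.item() for batch in train_dl]
assert sorted(seen) == list(range(10))
```

If `__len__` had returned something wrong here, the sampler would draw from the wrong index range, which is exactly the collapsed behavior in the question.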

Upvotes: 5
