jss367

Reputation: 5381

What should __len__ be for PyTorch when generating unlimited data?

Say I am trying to use PyTorch to learn the equation y = 2x, and I want to generate an unlimited amount of data to train my model with. I am supposed to provide a __len__ function; here's an example below. What should it return in this case? How do I specify the number of mini-batch iterations per epoch? Do I just set a number arbitrarily?

import numpy as np
from torch.utils.data import Dataset

class UnlimitedData(Dataset):
    def __init__(self):
        pass
    
    def __getitem__(self, index):
        x = np.random.randint(1,10)
        y = 2 * x
        return x, y
    
    def __len__(self):
        return 1000000 # This works but is not correct

Upvotes: 2

Views: 662

Answers (1)

Szymon Maszke

Reputation: 24701

You should use torch.utils.data.IterableDataset instead of torch.utils.data.Dataset. In your case it would be:

import torch


class Dataset(torch.utils.data.IterableDataset):
    def __init__(self, batch_size):
        super().__init__()
        self.batch_size = batch_size

    def __iter__(self):
        while True:
            x = torch.randint(1, 10, (self.batch_size,))
            y = 2 * x
            yield x, y

You should use batches (probably large ones), as that speeds up computation; PyTorch is well suited to GPU computation over many samples at once.
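A minimal usage sketch (the class and numbers here are illustrative): since the dataset already yields whole batches, pass `batch_size=None` to `DataLoader` to disable its automatic batching, and cap the otherwise infinite iterator with `itertools.islice` to get a fixed number of mini-batch iterations per "epoch":

```python
import itertools

import torch


class UnlimitedData(torch.utils.data.IterableDataset):
    def __init__(self, batch_size):
        super().__init__()
        self.batch_size = batch_size

    def __iter__(self):
        # Infinite stream of (x, y) batches for y = 2x.
        while True:
            x = torch.randint(1, 10, (self.batch_size,))
            y = 2 * x
            yield x, y


# batch_size=None tells the DataLoader the dataset already
# produces batches, so it should not batch again.
loader = torch.utils.data.DataLoader(UnlimitedData(batch_size=64), batch_size=None)

# islice truncates the infinite iterator, which is one way to
# define "iterations per epoch" for an unlimited data source.
steps_per_epoch = 100
for x, y in itertools.islice(loader, steps_per_epoch):
    pass  # training step would go here
```

With this setup the number of iterations per epoch is no longer encoded in `__len__`; it is just however many batches you choose to draw from the stream.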

Upvotes: 3
