Reputation: 51
I want to train a BERT transformer model using the HuggingFace transformers library. During training, HuggingFace shuffles the training data for each epoch, but I don't want to shuffle the data. For example, if I have 5 training samples and the batch size is 2, then I want the training data to be presented in order as [1, 2], [3, 4] and [5]. I cannot find any resources that show how to disable the default shuffling.
Upvotes: 5
Views: 3591
Reputation: 3710
Generally, it is considered good practice to randomise the presentation order of the training data for each epoch. When training with batches, this also means randomly sampling the training examples that make up each batch. Doing so keeps the data seen at each training iteration approximately independent and identically distributed (i.i.d.), which leads to improved overall performance. If the order of the training data is kept fixed across epochs, the model can develop a disposition for that particular order and will likely not perform well in other circumstances.
For this reason, HuggingFace will not implement the ability to switch off shuffling in the data loader. You can see that it was requested on their GitHub and declined, with the recommendation that anyone who wants this behaviour should subclass Trainer and override get_train_dataloader to return a training data loader that does not use a random sampler or shuffle=True.
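As a rough sketch (untested; the class name NoShuffleTrainer is just illustrative), such an override could look something like this, assuming the standard Trainer attributes train_dataset, data_collator and args, plus PyTorch's SequentialSampler:

    from torch.utils.data import DataLoader, SequentialSampler
    from transformers import Trainer

    class NoShuffleTrainer(Trainer):
        # Illustrative subclass: yields batches in dataset order
        # instead of the Trainer's default random sampling.
        def get_train_dataloader(self) -> DataLoader:
            return DataLoader(
                self.train_dataset,
                batch_size=self.args.train_batch_size,
                sampler=SequentialSampler(self.train_dataset),
                collate_fn=self.data_collator,
                drop_last=self.args.dataloader_drop_last,
            )

You would then instantiate NoShuffleTrainer with the same arguments you would normally pass to Trainer, and each epoch will iterate over the training set in its original order.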
Upvotes: 4