Nusrat Jahan

Reputation: 51

How to stop data shuffling while training the HuggingFace BERT model?

I want to train a BERT transformer model using the HuggingFace implementation/library. During training, HuggingFace shuffles the training data for each epoch, but I don't want to shuffle the data. For example, if I have 5 training examples and a batch size of 2, then I want the training data to be presented as [1, 2], [2, 3], [3, 4] and [4, 5]. I cannot find any resources that show how to disable the default shuffling.

Upvotes: 5

Views: 3591

Answers (1)

Kyle F. Hartzenberg

Reputation: 3710

Generally, it is considered good practice to randomise the presentation order of the training data for each epoch. If using batches, this also includes randomly sampling the training data to form each batch. Doing so helps ensure that the batches seen at each training iteration are independently and identically distributed (i.i.d.), which typically improves overall performance. If the order of the training data is kept fixed across epochs, the model can develop a bias towards that particular presentation order and will likely generalise worse.

For this reason, HuggingFace will not implement the ability to switch off shuffling for the data loader. You can see that it was requested here on their GitHub and denied, with the recommendation that anyone wanting such behaviour should subclass Trainer and override get_train_dataloader to return a training data loader that uses a sequential sampler rather than a random one (i.e. no shuffle=True). A sketch of that approach is shown below.
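As a rough, untested sketch of that recommendation (the SequentialTrainer name is my own, and the argument handling is simplified compared to the real Trainer.get_train_dataloader), the subclass might look something like this:

```python
from torch.utils.data import DataLoader, SequentialSampler
from transformers import Trainer

class SequentialTrainer(Trainer):  # hypothetical name for this sketch
    def get_train_dataloader(self) -> DataLoader:
        if self.train_dataset is None:
            raise ValueError("Trainer: training requires a train_dataset.")
        return DataLoader(
            self.train_dataset,
            batch_size=self.args.train_batch_size,
            # SequentialSampler iterates over the dataset in its stored
            # order, so every epoch sees the same sequence of batches.
            sampler=SequentialSampler(self.train_dataset),
            collate_fn=self.data_collator,
            drop_last=self.args.dataloader_drop_last,
            num_workers=self.args.dataloader_num_workers,
        )
```

You would then instantiate SequentialTrainer wherever you would normally use Trainer. Note that a SequentialSampler yields contiguous, non-overlapping batches ([1, 2], [3, 4], [5] for the five-example case above), so the overlapping windows described in the question would additionally require a custom sampler.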

Upvotes: 4
