Reputation: 8015
In Keras's `model.fit`, there is a `shuffle` parameter:
shuffle: Boolean (whether to shuffle the training data before each epoch) or str (for 'batch'). 'batch' is a special option for dealing with the limitations of HDF5 data; it shuffles in batch-sized chunks. Has no effect when steps_per_epoch is not None.
Assume the training set is a list with 50000 elements. Will the whole list be randomly permuted before each epoch? Or, if the batch size is 250, do only the elements belonging to each batch get permuted? What is the correct understanding?
Upvotes: 32
Views: 54113
Reputation: 5732
It will shuffle your entire dataset (`x`, `y` and `sample_weight` together) first and then make batches according to the `batch_size` argument you passed to `fit`.
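To make the order of operations concrete, here is a minimal sketch in plain Python. It is not the Keras source; the helper name `shuffled_batches` is made up for illustration, and it only mimics the idea of permuting all sample indices once per epoch and then slicing that permutation into batches. The sizes (50000 samples, batch size 250) are the ones from the question.

```python
import random

def shuffled_batches(n_samples, batch_size, rng):
    """Hypothetical helper: permute all indices, then slice into batches."""
    indices = list(range(n_samples))
    rng.shuffle(indices)  # one permutation over the whole dataset
    for start in range(0, n_samples, batch_size):
        # each batch is a consecutive slice of the already-shuffled indices
        yield indices[start:start + batch_size]

rng = random.Random(0)  # fixed seed only so the sketch is repeatable
epoch_batches = list(shuffled_batches(50000, 250, rng))

print(len(epoch_batches))     # 200 batches per epoch
print(len(epoch_batches[0]))  # 250 indices in the first batch
```

Because the permutation covers all 50000 indices, any sample can end up in any batch, and calling the helper again for the next epoch gives a different assignment.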
As @yuk pointed out in the comments, the code has changed significantly since 2018. The documentation for the `shuffle` parameter is now clearer on its own. You can choose to shuffle the entire training data or to shuffle in batch-sized chunks:
shuffle: Boolean (whether to shuffle the training data
before each epoch) or str (for 'batch'). This argument is ignored
when `x` is a generator. 'batch' is a special option for dealing
with the limitations of HDF5 data; it shuffles in batch-sized
chunks. Has no effect when `steps_per_epoch` is not `None`.
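For contrast, here is a rough sketch of one way to read "shuffles in batch-sized chunks". It assumes the data is cut into contiguous chunks of `batch_size` indices and only the order of the chunks is permuted, so each chunk stays a contiguous range (which is what makes it friendly to HDF5 reads). The function name `chunk_shuffled_indices` is hypothetical and this is not taken from the Keras source.

```python
import random

def chunk_shuffled_indices(n_samples, batch_size, rng):
    """Hypothetical helper: permute the order of contiguous batch-sized chunks."""
    chunks = [
        list(range(start, min(start + batch_size, n_samples)))
        for start in range(0, n_samples, batch_size)
    ]
    rng.shuffle(chunks)  # reorder the chunks, not the samples inside them
    return [i for chunk in chunks for i in chunk]

order = chunk_shuffled_indices(12, 4, random.Random(0))
print(order)  # three runs of 4 consecutive indices, in a permuted order
```

Under this reading, samples 0 to 3 always travel together, unlike `shuffle=True`, where every sample is placed independently.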
Upvotes: 32