user297850

Reputation: 8015

shuffle in the model.fit of keras

Keras's model.fit has a shuffle parameter, documented as:

shuffle: Boolean (whether to shuffle the training data before each epoch) or str (for 'batch'). 'batch' is a special option for dealing with the limitations of HDF5 data; it shuffles in batch-sized chunks. Has no effect when steps_per_epoch is not None.

Assume the training set is a list with 50000 elements. Is the whole list randomly permuted before each epoch? Or, if the batch size is 250, are only the elements within each batch permuted? Which is the correct understanding?

Upvotes: 32

Views: 54113

Answers (1)

Y. Luo

Reputation: 5732

It will shuffle your entire dataset (x, y and sample_weight together) first and then make batches according to the batch_size argument you passed to fit.
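To make the order of operations concrete, here is a minimal sketch in plain Python of "permute everything, then slice into batches", using the 50000-element, batch-size-250 numbers from the question. The helper name `shuffled_batches` is hypothetical and this is not Keras's actual implementation, only an illustration of the behaviour described above.

```python
import random

def shuffled_batches(num_samples, batch_size, rng):
    """Yield lists of sample indices for one epoch (hypothetical helper)."""
    indices = list(range(num_samples))
    rng.shuffle(indices)  # permute the whole dataset, not each batch separately
    for start in range(0, num_samples, batch_size):
        # x, y and sample_weight would all be indexed with the same slice
        yield indices[start:start + batch_size]

rng = random.Random(0)
epoch_batches = list(shuffled_batches(50000, 250, rng))
print(len(epoch_batches))  # 200 batches of 250 indices each
```

Calling `shuffled_batches` again for the next epoch draws a fresh permutation, so a given sample generally lands in a different batch each epoch.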

Edit

As @yuk pointed out in a comment, the code has changed significantly since 2018. The documentation for the shuffle parameter is now clearer on its own: you can either shuffle the entire training data or, with 'batch', shuffle in batch-sized chunks:

        shuffle: Boolean (whether to shuffle the training data
            before each epoch) or str (for 'batch'). This argument is ignored
            when `x` is a generator. 'batch' is a special option for dealing
            with the limitations of HDF5 data; it shuffles in batch-sized
            chunks. Has no effect when `steps_per_epoch` is not `None`.
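For contrast, the sketch below shows one reading of "shuffles in batch-sized chunks": the order of the chunks is permuted while the samples inside each chunk stay contiguous, which suits storage such as HDF5 where contiguous reads are cheap. The helper name `chunk_shuffle` is hypothetical, and the handling of a trailing partial chunk (left at the end) is an assumption, not something the docstring states.

```python
import random

def chunk_shuffle(indices, batch_size, rng):
    """Reorder batch-sized chunks; keep each chunk's contents contiguous (assumed behaviour)."""
    full = len(indices) - len(indices) % batch_size
    chunks = [indices[i:i + batch_size] for i in range(0, full, batch_size)]
    rng.shuffle(chunks)  # reorder whole chunks; contents of each chunk are untouched
    leftover = indices[full:]  # assumption: a trailing partial chunk stays last
    return [i for chunk in chunks for i in chunk] + leftover

order = chunk_shuffle(list(range(10)), 3, random.Random(0))
print(order)  # some ordering of [0,1,2], [3,4,5], [6,7,8], followed by 9
```

Under this reading, samples that start in the same chunk always end up in the same batch, which is weaker randomization than `shuffle=True`.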

Upvotes: 32
