Should I shuffle the data to train a neural network using backpropagation?

Question

I want to train a neural network using backpropagation, and I have a data set like this:

Should I shuffle the input data?

Franck Dernoncourt · Accepted Answer

Yes, and it should be reshuffled at each epoch. To quote {1}:

As for any stochastic gradient descent method (including the mini-batch case), it is important for efficiency of the estimator that each example or minibatch be sampled approximately independently. Because random access to memory (or even worse, to disk) is expensive, a good approximation, called incremental gradient (Bertsekas, 2010), is to visit the examples (or mini-batches) in a fixed order corresponding to their order in memory or disk (repeating the examples in the same order on a second epoch, if we are not in the pure online case where each example is visited only once). In this context, it is safer if the examples or mini-batches are first put in a random order (to make sure this is the case, it could be useful to first shuffle the examples). Faster convergence has been observed if the order in which the mini-batches are visited is changed for each epoch, which can be reasonably efficient if the training set holds in computer memory.
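As a rough illustration of the last point in the quote, here is a minimal Python sketch, assuming the training set fits in memory. The names `minibatches`, `batch_size`, and `seed` are hypothetical and not from {1}; the toy data is a placeholder. It shuffles a list of indices rather than the data, so each input stays paired with its target, and it draws a new visiting order at the start of every epoch.

```python
import random

def minibatches(examples, batch_size, epochs, seed=0):
    """Yield (epoch, batch) pairs, visiting the examples in a new random order each epoch."""
    rng = random.Random(seed)            # seeded only so the sketch is reproducible
    order = list(range(len(examples)))   # shuffle indices, not the data itself
    for epoch in range(epochs):
        rng.shuffle(order)               # new visiting order for this epoch
        for start in range(0, len(order), batch_size):
            idx = order[start:start + batch_size]
            yield epoch, [examples[i] for i in idx]

# Toy data set of (input, target) pairs; the values are placeholders.
data = [(x, 2 * x) for x in range(10)]

for epoch, batch in minibatches(data, batch_size=4, epochs=2):
    # A real training loop would run one backpropagation step on `batch` here.
    print(epoch, batch)
```

Every example is still visited exactly once per epoch; only the order changes. If the data set is too large for memory, shuffling it once on disk before training is the cheaper approximation the quote describes.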

{1} Bengio, Yoshua. "Practical recommendations for gradient-based training of deep architectures." Neural Networks: Tricks of the Trade. Springer Berlin Heidelberg, 2012. 437-478.
