Reputation: 649
I have 6.5 GB worth of training data for my GRU network. I intend to split the training into sessions, i.e., pause and resume training, since I am using a laptop. I'm assuming it will take days to train my neural net on the whole 6.5 GB, so I'll pause the training and then resume it at some other time.
Here's my question: if I shuffle the batches of training data, will the neural net remember which data has already been used for training?
Please note that I'm using the global_step parameter of tf.train.Saver().save.
Thank you very much in advance!
Upvotes: 2
Views: 1415
Reputation: 2322
I would advise you to save your model at certain epochs. Say you train for 80 epochs; it would be wise to save your model every 20 epochs (20, 40, 60), though the right interval depends on the capacity of your laptop. The reason is that after one epoch, your network will have seen the whole training set. If your whole dataset can't be processed in a single epoch, I would advise you to randomly sample your training set from the whole dataset.

The whole point of shuffling is to let the network generalize over the whole dataset; it is usually done when forming batches, when selecting the training set, or when starting a new training epoch.

As for your main question: it is definitely OK to shuffle batches when pausing and resuming training. Shuffling ensures that each batch is a varied sample of the data, so the gradients are computed over a representative batch instead of over one image.
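The shuffle-each-epoch and save-every-N-epochs pattern described above can be sketched in plain Python like this (the `dataset` list, `save_every` value, and the returned checkpoint list are illustrative stand-ins; in TensorFlow the save step would be the actual `tf.train.Saver().save(sess, path, global_step=epoch)` call):

```python
import random

def train(dataset, num_epochs=80, save_every=20, seed=0):
    """Sketch: reshuffle the data every epoch, checkpoint every N epochs.

    Returns the epochs at which a checkpoint would be written, e.g.
    [20, 40, 60, 80] for the defaults, mirroring the schedule above.
    """
    rng = random.Random(seed)
    checkpoints = []
    for epoch in range(1, num_epochs + 1):
        batches = dataset[:]            # copy so the original order is kept
        rng.shuffle(batches)            # new random order at every epoch
        for batch in batches:
            pass                        # forward/backward pass would go here
        if epoch % save_every == 0:     # stand-in for Saver().save(..., global_step=epoch)
            checkpoints.append(epoch)
    return checkpoints
```

Because each epoch reshuffles a fresh copy of the data, resuming from a saved checkpoint and continuing with newly shuffled batches changes nothing about correctness; the network does not need to "remember" which examples it has seen within an epoch.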
Upvotes: 2