Reputation: 25366
I am learning TensorFlow (as well as deep learning in general). I am wondering when we need to break the input training data into batches, and how we determine the batch size. Is there a rule of thumb? Thanks!
Upvotes: 0
Views: 215
Reputation: 101
Deep learning algorithms are generally run on GPUs, which have limited memory, so only a limited number of input samples (commonly called the batch size) can be loaded at a time.
In general, a larger batch size reduces the overall computation time: the internal matrix multiplications are parallelized on the GPU, and with fewer, larger batches less time is spent reading/writing gradients and other intermediate outputs.
Another probable benefit of a large batch size: in multi-class classification problems with many classes, a larger batch size can help the model generalize better across the classes (i.e. it helps avoid over-fitting); a standard technique here is to keep the class distribution within each batch roughly uniform.
Other factors that come into play when choosing a batch size are the learning rate and the type of optimization method.
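For illustration, here is a minimal sketch of how input data is commonly batched with the tf.data API; the dummy data, buffer size, and batch size of 32 are assumptions for the example, not recommendations:

```python
import numpy as np
import tensorflow as tf

# Dummy data (assumed shapes): 1,000 samples, 20 features, 10 classes.
features = np.random.rand(1000, 20).astype("float32")
labels = np.random.randint(0, 10, size=(1000,))

BATCH_SIZE = 32  # tune for your GPU memory and model size

dataset = (
    tf.data.Dataset.from_tensor_slices((features, labels))
    .shuffle(buffer_size=1000)   # shuffle before batching
    .batch(BATCH_SIZE)           # group samples into batches
    .prefetch(tf.data.AUTOTUNE)  # overlap input preparation with training
)

# A Keras model can consume the batched dataset directly:
# model.fit(dataset, epochs=5)
```

A common heuristic is to start with a power of two such as 32 or 64 and increase it until GPU memory or validation performance becomes the limiting factor.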
I hope this answers your question to a certain extent!
Upvotes: 1