Edamame

Reputation: 25366

TensorFlow: how to determine if we want to break the training dataset into batches

I am learning TensorFlow (as well as deep learning in general). I am wondering: when do we need to break the input training data into batches, and how do we determine the batch size? Is there a rule of thumb? Thanks!

Upvotes: 0

Views: 215

Answers (1)

chandrakant_k

Reputation: 101

Deep learning algorithms are generally run on GPUs, which have limited memory, so only a limited number of input samples (commonly called the batch size in the algorithm) can be loaded at a time.

In general, a larger batch size reduces the overall computation time: the internal matrix multiplications are done in parallel on the GPU, so with large batches less time is spent reading and writing gradients (and possibly the outputs of some other operations).
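To make the idea of batching concrete, here is a minimal sketch in plain Python (no TensorFlow; the function name `make_batches` and the sample data are hypothetical) of how a training set is split into fixed-size batches, with the last batch possibly smaller:

```python
def make_batches(samples, batch_size):
    """Split `samples` into consecutive batches of at most `batch_size` items."""
    return [samples[i:i + batch_size]
            for i in range(0, len(samples), batch_size)]

# 10 training samples split with a (hypothetical) batch size of 4:
data = list(range(10))
batches = make_batches(data, batch_size=4)
print([len(b) for b in batches])  # -> [4, 4, 2]
```

In TensorFlow this is typically handled for you, e.g. by `tf.data.Dataset.batch(batch_size)` or the `batch_size` argument of `Model.fit`; the sketch only shows the underlying splitting.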

Another probable benefit of a large batch size: in multi-class classification problems with many classes, a larger batch size can help the algorithm generalize better across the different classes (technically, avoid over-fitting). A standard technique here is to keep a uniform distribution of classes within each batch.

Other factors also come into play when deciding the batch size, such as the learning rate and the type of optimization method.
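One well-known heuristic connecting these factors (not stated in the answer itself, so treat it as an illustrative assumption) is the linear scaling rule: when you multiply the batch size by some factor, multiply the learning rate by the same factor. A sketch, where `base_lr` and `base_batch` are a reference configuration you already trust:

```python
def scaled_learning_rate(base_lr, base_batch, batch_size):
    """Linear scaling rule: scale the learning rate in proportion
    to the batch size, relative to a known-good baseline."""
    return base_lr * (batch_size / base_batch)

# Hypothetical baseline: lr 0.1 at batch size 256.
# Moving to batch size 1024 (4x larger) suggests lr 0.4.
print(scaled_learning_rate(0.1, 256, 1024))  # -> 0.4
```

This is only a starting point; in practice the best learning rate for a given batch size is still found empirically.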

I hope this answers your question to some extent!

Upvotes: 1
