Reputation: 6290
I have a training set consisting of 36 data points, and I want to train a neural network on it. I can choose the batch size to be, for example, 1, 12, or 36 (any divisor of 36).
Of course, increasing the batch size substantially decreases the training runtime.
Is there a disadvantage to choosing e.g. 12 as the batch size instead of 1?
Upvotes: 2
Views: 2148
Reputation: 3298
I agree with lejlot. The batch size is not the problem in your current model building, given the very small data size. Once you move on to larger data that can't fit in memory, try different batch sizes (say, some powers of 2, e.g. 32, 128, 512, ...).
The choice of batch size involves a trade-off:
A batch generally approximates the distribution of the input data better than a single input does. The larger the batch, the better the approximation; however, the batch also takes longer to process while still producing only one weight update. For inference (evaluate/predict), it is recommended to pick a batch size as large as you can afford without running out of memory, since larger batches usually make evaluation/prediction faster.
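The "only one update per batch" point can be made concrete with a little arithmetic: with a fixed number of samples, the batch size directly determines how many weight updates happen in each epoch. A minimal sketch for the 36-point dataset from the question:

```python
# With 36 training samples, the batch size determines how many
# gradient/weight updates occur per epoch.
n_samples = 36

for batch_size in (1, 12, 36):  # the divisors mentioned in the question
    updates_per_epoch = n_samples // batch_size
    print(f"batch_size={batch_size:2d} -> {updates_per_epoch} update(s) per epoch")
# batch_size=1 gives 36 updates per epoch, batch_size=36 gives just 1.
```

So a larger batch is faster per epoch partly because the network is updated fewer times, not only because of hardware parallelism.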
You will develop a better intuition after trying different batch sizes. If your hardware and time allow, let the machine pick the batch size for you (loop through different batch sizes as part of a grid search).
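Such a loop can be sketched as follows. This is a hypothetical illustration using scikit-learn's `MLPClassifier` on random toy data (the model, data, and hyperparameters are assumptions, not the asker's actual setup; any framework whose `fit` accepts a batch size works the same way):

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

# Toy stand-in for the 36-point dataset from the question.
rng = np.random.default_rng(0)
X = rng.normal(size=(36, 4))
y = rng.integers(0, 2, size=36)

# Manual "grid search" over candidate batch sizes (the divisors of 36).
results = {}
for batch_size in (1, 12, 36):
    model = MLPClassifier(hidden_layer_sizes=(8,), batch_size=batch_size,
                          max_iter=300, random_state=0)
    scores = cross_val_score(model, X, y, cv=3)
    results[batch_size] = scores.mean()

best = max(results, key=results.get)
print(f"CV accuracy per batch size: {results}")
print(f"best batch size by CV accuracy: {best}")
```

In practice you would use your real data and network here; the point is only that batch size can be treated as one more hyperparameter in the search loop.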
Here are some good answers: one, two.
Upvotes: 1
Reputation: 66825
There are no golden rules for batch sizes. Period.
However, your dataset is extremely tiny, so the batch size will probably not matter at all; all your problems will come from the lack of data, not from any hyperparameters.
Upvotes: 6