Stringer Bell

Reputation: 161

Batch normalization and small mini-batch

I am not completely familiar with batch normalization layers. As I understand it, they compute the normalization at training time using mini-batch statistics.

Do any of you have experience using these layers when the mini-batch size is very small (for example, 2 or 4 images per iteration)? Is there any reason for them not to work well?

My feeling is that the statistics are computed on a very small sample at training time, which could negatively affect training. What do you think?
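The intuition above can be checked numerically: the standard deviation of a mini-batch mean shrinks roughly as 1/sqrt(batch size), so very small batches give noisy estimates of the statistics. A minimal NumPy sketch (illustrative numbers only, not from the question):

```python
import numpy as np

rng = np.random.default_rng(0)
# stand-in "population" of activations for one channel
population = rng.normal(loc=0.0, scale=1.0, size=100_000)

for batch_size in (2, 4, 32, 256):
    # draw many mini-batches and look at the spread of their sample means
    batches = rng.choice(population, size=(10_000, batch_size))
    batch_means = batches.mean(axis=1)
    # spread shrinks roughly as 1 / sqrt(batch_size)
    print(f"batch size {batch_size:>3}: std of batch means = {batch_means.std():.3f}")
```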

Upvotes: 2

Views: 2814

Answers (1)

Hasnain Raza

Reputation: 681

Your intuition is right that the mini-batch samples may differ from the population (mini-batch vs. all samples), but this problem was addressed in the batch normalization paper. Specifically, during training you compute the variance of your samples by dividing by the batch size N (the biased estimate), while at test time you account for this by using the unbiased variance estimate (multiplying by N/(N-1)). Have a look here for a more detailed and easy-to-understand explanation: Batch Normalization
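To make the train/test distinction concrete, here is a minimal NumPy sketch of the idea. The function names and the momentum-based running average are assumptions for illustration, not something stated in the answer: the mini-batch is normalized with the biased (divide-by-N) variance, while the running test-time estimate stores the N/(N-1)-corrected unbiased variance.

```python
import numpy as np

rng = np.random.default_rng(0)

def batchnorm_train(x, running_mean, running_var, momentum=0.1, eps=1e-5):
    """One training step of batch norm over a mini-batch x of shape (N, C).

    Normalizes with the biased (divide-by-N) variance, but accumulates the
    unbiased (N/(N-1)-corrected) variance for use at test time.
    """
    N = x.shape[0]
    mu = x.mean(axis=0)
    var_biased = x.var(axis=0)                # NumPy's default divides by N
    x_hat = (x - mu) / np.sqrt(var_biased + eps)
    var_unbiased = var_biased * N / (N - 1)   # correction for test-time stats
    running_mean = (1 - momentum) * running_mean + momentum * mu
    running_var = (1 - momentum) * running_var + momentum * var_unbiased
    return x_hat, running_mean, running_var

def batchnorm_test(x, running_mean, running_var, eps=1e-5):
    """Inference: normalize with the accumulated population estimates."""
    return (x - running_mean) / np.sqrt(running_var + eps)

# usage: run a few training steps with a tiny batch (N=4), as in the question
running_mean, running_var = np.zeros(3), np.ones(3)
for _ in range(100):
    batch = rng.normal(loc=2.0, scale=3.0, size=(4, 3))
    _, running_mean, running_var = batchnorm_train(batch, running_mean, running_var)
```

With N this small, each per-batch variance estimate is quite noisy, so the running estimates wander more than they would with a larger batch, which is exactly the concern raised in the question.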

Upvotes: 1
