jstaker7

Reputation: 1236

Batch normalization changes depending on batch during test time

Say I am using a batch-size of 64 datapoints. During training I update the exponential moving averages for both mean and variance, and use these averages during test time.

I have two test cases: (1) datapoint-A + 63 other unique datapoints, (2) datapoint-A repeated 64 times

What I expect to happen: During test time, the output for datapoint-A should be the same for both cases, since the average mean and variances are used to normalize.

What is happening in my implementation: The output is different for each of the test cases, i.e., the output for each test example depends on the other examples provided in the batch, due to normalization.

Is my expectation incorrect, or is it correct and I need to focus on debugging my implementation?
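The expectation can be checked directly. Below is a minimal numpy sketch (not your implementation; the shapes, seed, and the placeholder running statistics are assumptions) contrasting the two behaviors: normalizing with per-batch statistics makes datapoint-A's output depend on its batch-mates, while normalizing with fixed statistics makes the two test cases agree.

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize(batch, mean, var, eps=1e-5):
    # Normalize a batch of shape (N, D) with the given statistics.
    return (batch - mean) / np.sqrt(var + eps)

point_a = rng.normal(size=(1, 4))
others = rng.normal(size=(63, 4))

batch_mixed = np.vstack([point_a, others])       # case 1: A + 63 unique points
batch_repeat = np.repeat(point_a, 64, axis=0)    # case 2: A repeated 64 times

# Buggy test-time behavior: statistics recomputed per batch.
buggy_mixed = normalize(batch_mixed, batch_mixed.mean(0), batch_mixed.var(0))[0]
buggy_repeat = normalize(batch_repeat, batch_repeat.mean(0), batch_repeat.var(0))[0]
print(np.allclose(buggy_mixed, buggy_repeat))    # False: output depends on batch-mates

# Expected test-time behavior: fixed (e.g. moving-average) statistics.
# Zeros/ones are placeholders standing in for the learned running averages.
run_mean, run_var = np.zeros(4), np.ones(4)
fixed_mixed = normalize(batch_mixed, run_mean, run_var)[0]
fixed_repeat = normalize(batch_repeat, run_mean, run_var)[0]
print(np.allclose(fixed_mixed, fixed_repeat))    # True: output is batch-independent
```

If your implementation behaves like the first half of this sketch at test time, the batch statistics are still being recomputed during inference.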

Upvotes: 0

Views: 672

Answers (1)

lejlot

Reputation: 66775

The normalization statistics should not be adjusted at test time. You need to distinguish between the train phase and the test phase of your network. During training you fit the normalization; once training is finished, compute the normalization statistics over the whole training set (or at least a representative batch), then freeze them and use those fixed values in the prediction phase.
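This train/test split can be sketched as a minimal batch-norm layer with a `training` flag (a simplified sketch, assuming exponential moving averages with momentum 0.9 and omitting the learnable scale/shift parameters a full batch-norm layer would also have):

```python
import numpy as np

class BatchNorm1d:
    """Minimal batch-norm sketch: batch statistics while training,
    frozen running averages at prediction time."""

    def __init__(self, dim, momentum=0.9, eps=1e-5):
        self.running_mean = np.zeros(dim)
        self.running_var = np.ones(dim)
        self.momentum = momentum
        self.eps = eps

    def __call__(self, x, training):
        if training:
            # Fit the normalization: use this batch's statistics
            # and fold them into the running averages.
            mean, var = x.mean(axis=0), x.var(axis=0)
            m = self.momentum
            self.running_mean = m * self.running_mean + (1 - m) * mean
            self.running_var = m * self.running_var + (1 - m) * var
        else:
            # Predict: use only the fixed running statistics,
            # so each point's output is independent of its batch-mates.
            mean, var = self.running_mean, self.running_var
        return (x - mean) / np.sqrt(var + self.eps)
```

With `training=False`, a given datapoint produces the same output no matter which other examples share its batch, which is exactly the behavior the question expects.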

Upvotes: 1
