What are the learnable parameters in batch normalization?

Question

There are four variables in question: gamma, beta, the moving average of the mean, and the moving average of the variance.

Is it necessary to snapshot the moving averages and load them at test time?

A better question:

For this implementation of batch normalization in TensorFlow, do I need to transfer the batch mean and batch variance from training time to testing time? If so, how can I achieve that in TensorFlow?

dga · Accepted Answer

Yes. For any use of batch normalization, you train by normalizing each batch with that batch's own statistics, but you run inference using the long-term average of those statistics.
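To make the training/inference split concrete, here is a minimal sketch in plain Python rather than TensorFlow. The class name `ToyBatchNorm`, the `decay` and `eps` defaults, and the single-scalar-feature setup are all assumptions made for illustration; this is not the implementation referenced in the question. It also shows which of the four variables are learned: gamma and beta would be updated by the optimizer, while the two moving averages are only tracked during training.

```python
import math


class ToyBatchNorm:
    """Toy batch normalization over one scalar feature (illustrative only)."""

    def __init__(self, decay=0.9, eps=1e-5):
        self.gamma = 1.0        # learnable scale (an optimizer would update it)
        self.beta = 0.0         # learnable shift (an optimizer would update it)
        self.moving_mean = 0.0  # not learned by gradient descent, only tracked
        self.moving_var = 1.0   # not learned by gradient descent, only tracked
        self.decay = decay
        self.eps = eps

    def __call__(self, batch, training):
        if training:
            # Training: normalize with the statistics of this batch alone.
            mean = sum(batch) / len(batch)
            var = sum((x - mean) ** 2 for x in batch) / len(batch)
            # Fold the batch statistics into the long-term averages.
            self.moving_mean = self.decay * self.moving_mean + (1 - self.decay) * mean
            self.moving_var = self.decay * self.moving_var + (1 - self.decay) * var
        else:
            # Inference: use the long-term averages gathered during training.
            mean, var = self.moving_mean, self.moving_var
        scale = self.gamma / math.sqrt(var + self.eps)
        return [scale * (x - mean) + self.beta for x in batch]


bn = ToyBatchNorm(decay=0.5)
train_out = bn([1.0, 3.0], training=True)  # normalized with batch mean 2.0, variance 1.0
test_out = bn([1.0], training=False)       # normalized with the moving averages
```

If the moving averages were not carried over to test time, the inference branch would fall back to its initial values (0.0 and 1.0 here) and the outputs would no longer match what the network saw during training.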

You should save a copy of the variables that hold the mean and variance, and restore them when you run tests.

There shouldn't be any magic required: they're just variables, so they will be saved and restored when you use the Saver.

In the specific implementation you reference, the documentation for tf.train.ExponentialMovingAverage has a specific example of how to save and restore the moving averages for training and inference, respectively.
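The idea that the moving averages are "just variables" in the checkpoint can be sketched without any framework. The example below is a plain-Python stand-in for what the Saver does, not TensorFlow code; the helper names `save_checkpoint` and `load_checkpoint`, the JSON file format, and the numeric values are all hypothetical.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write every batch-norm variable to disk (hypothetical helper)."""
    with open(path, "w") as f:
        json.dump(state, f)


def load_checkpoint(path):
    """Read the variables back for inference (hypothetical helper)."""
    with open(path) as f:
        return json.load(f)


# State at the end of training. The values are made up for illustration.
trained = {
    "gamma": 1.2,        # learned by the optimizer
    "beta": -0.3,        # learned by the optimizer
    "moving_mean": 0.8,  # accumulated during training
    "moving_var": 2.5,   # accumulated during training
}

path = os.path.join(tempfile.mkdtemp(), "bn_checkpoint.json")
save_checkpoint(path, trained)

# At test time, restore all four values, not only gamma and beta.
restored = load_checkpoint(path)
```

The point of the sketch is only that the checkpoint has to contain all four values. In TensorFlow the Saver takes care of the file format, and the `tf.train.ExponentialMovingAverage` documentation describes how to restore the averaged values.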
