Jenia Golbstein

Reputation: 374

Pytorch Batchnorm layer different from Keras Batchnorm

I'm trying to copy pre-trained BN weights from a PyTorch model to its equivalent Keras model, but I keep getting different outputs.

I read the Keras and PyTorch BN documentation, and I think the difference lies in the way they calculate the "mean" and "var".

Pytorch:

The mean and standard-deviation are calculated per-dimension over the mini-batches

source: Pytorch BatchNorm

Thus, they average over samples.
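
As I understand it (a rough sketch with made-up shapes; momentum=1.0 is set only so the running stats equal the stats of this single batch), for a 4-D input this means one statistic per channel, reduced over the batch and spatial dimensions:

import torch

x = torch.rand(16, 3, 8, 8)  # (batch, channels, height, width)

# momentum=1.0 makes running_mean equal to the last batch's mean
bn = torch.nn.BatchNorm2d(num_features=3, momentum=1.0)
bn.train()
bn(x)

# one mean per channel, averaged over the batch and spatial dims (0, 2, 3)
manual_mean = x.mean(dim=(0, 2, 3))
print(torch.allclose(bn.running_mean, manual_mean))  # True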

Keras:

axis: Integer, the axis that should be normalized (typically the features axis). For instance, after a Conv2D layer with data_format="channels_first", set axis=1 in BatchNormalization.

source: Keras BatchNorm

and here they average over the features (channels).

What's the right way? How to transfer BN weights between the models?
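
For reference, here is how I inspect what each layer actually stores (a sketch with hypothetical layer sizes; the Keras layer has to be built before its weights exist):

import torch
from tensorflow import keras

torch_bn = torch.nn.BatchNorm2d(num_features=3)
print([name for name, _ in torch_bn.named_parameters()])
# ['weight', 'bias']
print([name for name, _ in torch_bn.named_buffers()])
# ['running_mean', 'running_var', 'num_batches_tracked']

keras_bn = keras.layers.BatchNormalization(axis=1)
keras_bn.build(input_shape=(None, 3, 8, 8))
print([w.name for w in keras_bn.weights])
# gamma, beta, moving_mean, moving_variance (in this order)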

Upvotes: 3

Views: 2126

Answers (1)

Andrea Quattrini

Reputation: 153

You can retrieve moving_mean and moving_variance from the running_mean and running_var attributes of the PyTorch module:

# torch weight, bias, running_mean, running_var correspond to
# keras gamma, beta, moving_mean, moving_variance

weights = torch_module.weight.detach().numpy()
bias = torch_module.bias.detach().numpy()
running_mean = torch_module.running_mean.numpy()
running_var = torch_module.running_var.numpy()

# keras BatchNormalization stores its weights in exactly this order
keras_module.set_weights([weights, bias, running_mean, running_var])
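
A quick way to check the transfer (a sketch with hypothetical shapes; note that PyTorch uses eps=1e-5 by default while tf.keras uses epsilon=1e-3, so the epsilons have to be matched explicitly, and fused=False avoids the fused NCHW kernel that is not available on every setup):

import numpy as np
import torch
from tensorflow import keras

torch_module = torch.nn.BatchNorm2d(num_features=8)
torch_module.eval()  # use the running statistics, not the batch statistics

keras_module = keras.layers.BatchNormalization(axis=1, epsilon=1e-5, fused=False)
keras_module.build(input_shape=(None, 8, 4, 4))  # create the variables before set_weights

weights = torch_module.weight.detach().numpy()
bias = torch_module.bias.detach().numpy()
running_mean = torch_module.running_mean.numpy()
running_var = torch_module.running_var.numpy()
keras_module.set_weights([weights, bias, running_mean, running_var])

x = np.random.rand(2, 8, 4, 4).astype("float32")
out_torch = torch_module(torch.from_numpy(x)).detach().numpy()
out_keras = keras_module(x, training=False).numpy()
print(np.abs(out_torch - out_keras).max())  # should be on the order of 1e-6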

Upvotes: 0
