Reputation: 163
From the TensorFlow documentation:
https://www.tensorflow.org/api_docs/python/tf/keras/layers/BatchNormalization
"Normalize the activations of the previous layer at each batch, i.e. applies a transformation that maintains the mean activation close to 0 and the activation standard deviation close to 1."
Therefore, I expected this layer to first calculate the mean and standard deviation of the previous layer's output, subtract the mean, and divide by the standard deviation for each sample in the batch. But apparently I'm wrong.
import numpy as np
import tensorflow as tf

if __name__ == "__main__":
    # flattened tensor, batch size of 2
    xnp = np.array([[1, 2, 3], [4, 5, 6]])
    xtens = tf.constant(xnp, dtype=tf.float32)
    nbatchnorm = tf.keras.layers.BatchNormalization()(xtens)
    # tensorflow output
    print(nbatchnorm)
    # what I expect to see: mean 0 and standard deviation 1 for each sample
    xmean = np.mean(xnp, axis=1)
    xstd = np.std(xnp, axis=1)
    normalized = (xnp - xmean.reshape(-1, 1)) / xstd.reshape(-1, 1)
    print(normalized)
Output:
tf.Tensor(
[[0.9995004 1.9990008 2.9985013]
[3.9980016 4.997502 5.9970026]], shape=(2, 3), dtype=float32)
[[-1.22474487 0. 1.22474487]
[-1.22474487 0. 1.22474487]]
Can someone please explain to me why these outputs are not the same, or at least similar? I don't see how this is normalizing anything.
Upvotes: 2
Views: 741
Reputation:
The output of Batch Normalization depends on several factors in its algorithm; the two that matter here are the training/inference mode and the axis that is normalized.

First, calling BatchNormalization()(xtens) without training=True runs the layer in inference mode, where it normalizes with its moving statistics rather than with the statistics of the current batch. Those moving statistics are initialized to a mean of 0 and a variance of 1, so with the default epsilon of 1e-3 the output is simply x / sqrt(1 + 0.001), which is the 0.9995 scaling visible in your output. Only with training=True does the layer use the mean and variance of the batch itself.

Second, even in training mode, BatchNormalization normalizes each feature across the batch dimension (default axis=-1), not each sample across its features. The per-sample normalization you computed with np.mean(..., axis=1) is what LayerNormalization does.
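Here is a minimal sketch with your toy input, assuming TensorFlow 2.x, that contrasts the two modes; LayerNormalization is included only for comparison with your hand-computed result, and the numbers in the comments are approximate:

import tensorflow as tf

x = tf.constant([[1, 2, 3], [4, 5, 6]], dtype=tf.float32)
bn = tf.keras.layers.BatchNormalization()

# Inference mode (the default): uses the moving statistics, which start
# at mean 0 and variance 1, so the output is x / sqrt(1 + 1e-3),
# i.e. roughly 0.9995 * x, which is exactly what you observed.
print(bn(x, training=False))

# Training mode: per-feature statistics across the batch (axis=-1).
# Each column has mean [2.5, 3.5, 4.5] and std 1.5, so the rows come
# out near [-1, -1, -1] and [1, 1, 1].
print(bn(x, training=True))

# The per-sample normalization computed in the question corresponds to
# LayerNormalization, which gives approximately [-1.22, 0, 1.22] per row.
ln = tf.keras.layers.LayerNormalization(axis=1)
print(ln(x))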
Upvotes: 1