zonzon510

Reputation: 163

Can someone explain the behaviour of tf.keras.layers.BatchNormalization?

From the TensorFlow documentation:

https://www.tensorflow.org/api_docs/python/tf/keras/layers/BatchNormalization

"Normalize the activations of the previous layer at each batch, i.e. applies a transformation that maintains the mean activation close to 0 and the activation standard deviation close to 1."

Therefore, I expect this layer to first compute the mean and standard deviation of the previous layer's output, then subtract the mean and divide by the standard deviation for each sample in the batch. But apparently I'm wrong.

import numpy as np
import tensorflow as tf


if __name__ == "__main__":
    # flattened tensor, batch size of 2
    xnp = np.array([[1, 2, 3], [4, 5, 6]])
    xtens = tf.constant(xnp, dtype=tf.float32)

    nbatchnorm = tf.keras.layers.BatchNormalization()(xtens)

    # tensorflow output
    print(nbatchnorm)

    # what I expect to see: mean 0 and standard deviation 1 for each sample
    xmean = np.mean(xnp, axis=1)
    xstd = np.std(xnp, axis=1)
    normalized = (xnp - xmean.reshape(-1, 1)) / xstd.reshape(-1, 1)

    print(normalized)

output:

tf.Tensor(
[[0.9995004 1.9990008 2.9985013]
 [3.9980016 4.997502  5.9970026]], shape=(2, 3), dtype=float32)

[[-1.22474487  0.          1.22474487]
 [-1.22474487  0.          1.22474487]]

Can someone please explain to me why these outputs are not the same, or at least similar? I don't see how this is normalizing anything.

Upvotes: 2

Views: 741

Answers (1)

user11530462

Reputation:

Well, Batch Normalization depends on several quantities, defined by the algorithm below.

  1. μ_B = (1 / m_B) · Σ_{i=1..m_B} x^(i)
  2. σ_B² = (1 / m_B) · Σ_{i=1..m_B} (x^(i) − μ_B)²
  3. x̂^(i) = (x^(i) − μ_B) / √(σ_B² + ε)
  4. z^(i) = γ ⊗ x̂^(i) + β

  • μ_B is the vector of input means, evaluated over the whole mini-batch B (it contains one mean per input).
  • σ_B is the vector of input standard deviations, also evaluated over the whole mini-batch (it contains one standard deviation per input).
  • m_B is the number of instances in the mini-batch.
  • x̂^(i) is the vector of zero-centered and normalized inputs for instance i.
  • γ is the output scale parameter vector for the layer (it contains one scale parameter per input).
  • ⊗ represents element-wise multiplication (each input is multiplied by its corresponding output scale parameter).
  • β is the output shift (offset) parameter vector for the layer (it contains one offset parameter per input). Each input is offset by its corresponding shift parameter.
  • ε is a tiny number that avoids division by zero (typically 10⁻⁵). This is called a smoothing term.
  • z^(i) is the output of the BN operation. It is a rescaled and shifted version of the inputs.
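
To connect these formulas with the output in the question, here is a minimal numerical check, assuming TensorFlow 2.x and the layer's default arguments (axis=-1, epsilon=0.001, γ initialized to ones, β to zeros). Two things explain the question's output: the layer normalizes per feature (over the batch axis), not per sample, and when it is called directly like that it runs in inference mode, where it uses its moving mean (initialized to 0) and moving variance (initialized to 1) instead of the batch statistics, so the output is roughly x / √(1 + ε) ≈ 0.9995 · x.

import numpy as np
import tensorflow as tf

x = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

# default arguments assumed: axis=-1, epsilon=1e-3, gamma=1, beta=0
bn = tf.keras.layers.BatchNormalization()

# inference mode: normalizes with the moving mean (0) and moving
# variance (1), so the output is x / sqrt(1 + eps) ~= 0.9995 * x,
# which is the output shown in the question
print(bn(x, training=False))

# training mode: normalizes with the statistics of the current batch,
# computed per feature (over axis 0), not per sample
print(bn(x, training=True))

# the same computation by hand, following the formulas above
eps = 1e-3
mu = np.mean(x.numpy(), axis=0)    # mu_B: one mean per feature
var = np.var(x.numpy(), axis=0)    # sigma_B squared: one variance per feature
print((x.numpy() - mu) / np.sqrt(var + eps))   # matches the training=True output

With training=True the layer's output matches the manual computation (every entry is about ±0.9998): one mean and one variance per feature, which is exactly the per-input normalization described in the list above.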

Upvotes: 1
