Rishik

Reputation: 111

What are the shapes of beta and gamma parameters in layer normalization layer?

In layer normalization, we compute the mean and variance across the features of each input (instead of across the batch, as in batch normalization), normalize the input with that mean and variance, and then return gamma times the normalized input plus beta.

My question is: are gamma and beta scalars, each with shape (1, 1), or do they each have shape (1, number of hidden units)?

Here is how I have implemented layer normalization. Is this correct?

import numpy as np

def layernorm(layer, gamma, beta):
    # Mean and variance are computed per example, across the feature axis
    mean = np.mean(layer, axis=1, keepdims=True)
    variance = np.mean((layer - mean) ** 2, axis=1, keepdims=True)
    # Normalize, then scale and shift
    layer_hat = (layer - mean) / np.sqrt(variance + 1e-8)
    outputs = gamma * layer_hat + beta
    return outputs

where gamma and beta are defined as below:

gamma = np.random.normal(size=(1, 128))  # 128 hidden units in my case
beta = np.random.normal(size=(1, 128))
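
For example, I call it like this (the batch size of 32 is just a placeholder):

layer = np.random.randn(32, 128)    # (batch size, hidden units)
out = layernorm(layer, gamma, beta)
print(out.shape)                    # (32, 128)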

Upvotes: 1

Views: 2329

Answers (1)

Maybe

Reputation: 2279

According to TensorFlow's implementation, if the input has shape [B, rest], then gamma and beta have shape rest. rest could be (h,) for a 2-dimensional input or (h, w, c) for a 4-dimensional input.
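
You can verify this yourself. A minimal sketch, assuming TensorFlow 2.x and tf.keras.layers.LayerNormalization (the batch size and feature sizes below are just examples):

import tensorflow as tf

# 2-D input [B, h]: normalizing over the last axis gives gamma/beta of shape (h,)
x2d = tf.random.normal([32, 128])
ln2d = tf.keras.layers.LayerNormalization(axis=-1)
ln2d(x2d)  # build the layer
print(ln2d.gamma.shape, ln2d.beta.shape)  # (128,) (128,)

# 4-D input [B, h, w, c]: normalizing over all non-batch axes
# gives gamma/beta of shape (h, w, c)
x4d = tf.random.normal([32, 8, 8, 16])
ln4d = tf.keras.layers.LayerNormalization(axis=[1, 2, 3])
ln4d(x4d)
print(ln4d.gamma.shape, ln4d.beta.shape)  # (8, 8, 16) (8, 8, 16)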

Upvotes: 1
