I am training an LSTM network over time series data and would like to normalize the data, because my features are of different scales.
My data shape is
(n_samples x n_timestamps x n_features)
I would like to use a BatchNormalization layer. Should I set axis to 2 (features, as stated in the docs) or 1 (timestamps)? I would like my features to end up in the [0..1] range, since they are of very different scales.
The problem is the documentation doesn't say what this layer actually does; it only gives recommendations for CNNs.
Upvotes: 3
Views: 1592
Usually, you'd use the features dimension: axis=-1 (the last axis, which is axis 2 for your shape). The layer will treat each feature individually and normalize it based on statistics computed over every other dimension.
But it will not make them go into the range 0 to 1. It computes (x - mean) / sqrt(variance + epsilon) and then applies a learned scale factor and bias after the normalization.
For instance, take feature 0: it is normalized with its own mean and variance, then gets its own scale and bias. The same is repeated for feature 1, with another mean, another variance, another scale and bias.
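A minimal NumPy sketch of that math (this mirrors what the layer computes at inference with default-initialized parameters, not Keras internals; the toy shapes and epsilon value are assumptions):

```python
import numpy as np

# Toy batch: (n_samples, n_timestamps, n_features)
rng = np.random.default_rng(0)
x = rng.normal(loc=[10.0, -3.0], scale=[5.0, 0.5], size=(8, 4, 2))

# Per-feature statistics, computed over every axis except features
mean = x.mean(axis=(0, 1))   # shape (2,): one mean per feature
var = x.var(axis=(0, 1))     # shape (2,): one variance per feature

eps = 1e-3                   # small constant for numerical stability
x_hat = (x - mean) / np.sqrt(var + eps)

# Learned per-feature scale (gamma) and bias (beta); ones/zeros at init
gamma = np.ones(2)
beta = np.zeros(2)
y = gamma * x_hat + beta

print(y.mean(axis=(0, 1)))   # ~0 for each feature
print(y.std(axis=(0, 1)))    # ~1 for each feature
```

Note that both features end up with roughly zero mean and unit variance, regardless of their original scales, but the values are not confined to [0, 1].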
If you use the timesteps dimension, it will see each step individually and learn one scale factor per step. That would not make much sense: steps should all have a similar nature, unlike features, which can mean completely different things.
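A quick shape check (plain NumPy, just counting parameter sets; the toy shape is an assumption) makes the difference between the two axis choices concrete:

```python
import numpy as np

x = np.zeros((8, 4, 2))  # (n_samples, n_timestamps, n_features)

# Normalizing along features: stats reduced over axes (0, 1)
per_feature = x.mean(axis=(0, 1))
print(per_feature.shape)  # (2,) -> one parameter set per feature

# Normalizing along timesteps: stats reduced over axes (0, 2)
per_step = x.mean(axis=(0, 2))
print(per_step.shape)     # (4,) -> one parameter set per timestep position
```

With the timesteps axis you would learn parameters tied to absolute step positions, which is rarely what you want for a sequence model.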
If you do need values between 0 and 1, you can simply apply an Activation('sigmoid'). If you fear your values will be too saturated, you can apply a BatchNormalization() followed by an Activation('sigmoid').
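A small sketch of why the sigmoid step gives the [0..1] range (plain NumPy; the sample values are assumptions standing in for batch-normalized outputs):

```python
import numpy as np

def sigmoid(z):
    # Numerically stable logistic: maps any real value into (0, 1)
    return np.where(z >= 0,
                    1.0 / (1.0 + np.exp(-z)),
                    np.exp(z) / (1.0 + np.exp(z)))

x_hat = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])  # e.g. normalized values
y = sigmoid(x_hat)
print(y)  # every value strictly between 0 and 1
```

Normalizing first keeps the inputs near zero, where the sigmoid is close to linear, so fewer values land in the flat saturated tails.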
Upvotes: 3