Reputation: 83
Given a training dataset Xtrain (m x n), ytrain (m,), and some neural net sequential model.

When and to what range does the training data have to be normalized? How should predicted values be denormalized? And how do the choices of activation functions in the different layers affect this? And what about the target (ytrain) used in training?

I'm very confused, so any light you can shed on this would be greatly appreciated.
Upvotes: 0
Views: 254
Reputation: 2089
Do we have to normalize the Xtrain data?
- Yes, we have to
Does the range we normalize to depend on the input layer's activation function?
- No, it doesn't
Does that have to be denormalized?
- No
Does it have to be normalized to the range of the output layer's activation function, or to a common range for all layers?
- Actually, I didn't get the question. But let me explain how normalization works.
The main aim of normalization is to bring all features onto a common scale. Standardization also improves the numerical stability of your model. If you have very different features (e.g. some numeric columns in the range 1000 to 20000, some numeric columns in the range -10 to 5, some boolean columns, etc.), you should standardize them. This turns your very different features into comparable ones.
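As a minimal sketch of what that looks like in practice (assuming scikit-learn and NumPy arrays; the data and variable names here are purely illustrative):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Illustrative data: one column in the thousands, one small, one boolean-like
X_train = np.array([[ 1500.0, -7.2, 1.0],
                    [12000.0,  3.5, 0.0],
                    [18000.0, -1.1, 1.0]])

scaler = StandardScaler()                    # zero mean, unit variance per column
X_train_std = scaler.fit_transform(X_train)  # fit ONLY on the training data

# Later, apply the SAME training statistics to validation/test data:
# X_test_std = scaler.transform(X_test)
```

The important detail is that the scaler is fitted on the training data only and then reused for any other data you feed the model.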
But why do we need it? In neural networks, every neuron takes the features and its weights as input:

g(X) = X^T * w

So, if some of your features are much larger than the others, the model will pay more attention to the large numbers.
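A tiny numeric illustration of that dot product (the weights and the per-column mean/std used here are made up):

```python
import numpy as np

w = np.array([0.5, 0.5])                      # equal weights for two features
x_raw = np.array([15000.0, 2.0])              # one huge feature, one small one

# Standardize with assumed column statistics (mean, std per feature)
x_std = (x_raw - np.array([10000.0, 0.0])) / np.array([5000.0, 2.0])

print(x_raw @ w)   # 7501.0 -> the sum is dominated by the large feature
print(x_std @ w)   # 1.0    -> both features now contribute on a comparable scale
```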
Speaking about denormalization: do we need to denormalize the y values? No, we don't. Since we didn't normalize the y_train targets that the model was trained on, we don't need to denormalize the predicted values either.
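If you did choose to scale the targets (sometimes done for regression), then predictions would come out in the scaled space and you would map them back with the inverse transform. A hedged sketch, again assuming scikit-learn, with made-up numbers standing in for real predictions:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical regression targets, reshaped to a column for the scaler
y_train = np.array([120.0, 340.0, 95.0]).reshape(-1, 1)

y_scaler = StandardScaler()
y_train_scaled = y_scaler.fit_transform(y_train)

# ... train the model on (X_train_std, y_train_scaled) ...

# Predictions come back in the scaled space, so map them to original units:
y_pred_scaled = np.array([[0.3], [-1.1]])     # stand-in for model.predict(X_test)
y_pred = y_scaler.inverse_transform(y_pred_scaled)
print(y_pred)
```

But since you left ytrain untouched, none of this is needed in your case.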
Upvotes: 1