Reputation: 775
I have a neural network with three layers. I've tried using tanh and sigmoid activations for the hidden layers, and the output layer is just a simple linear function (I'm trying to model a regression problem).
For some reason my model seems to have a hard cutoff: it will never predict a value above some threshold, even though it should. What could be the reason for this?
Here is what predictions from the model look like (with sigmoid activations): [prediction plot omitted]
Update:
With relu activations, switching from gradient descent to Adam, and adding L2 regularization, the model now predicts the same value for every input...
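For context, here is a minimal sketch of the kind of setup described above (the layer sizes, input dimension, and framework are placeholders, not my exact code):

import numpy as np
from tensorflow import keras

# Hypothetical reconstruction: two hidden layers (sigmoid shown; tanh
# behaves the same way here) feeding a single linear output unit.
model = keras.Sequential([
    keras.Input(shape=(4,)),                 # placeholder input size
    keras.layers.Dense(16, activation="sigmoid"),
    keras.layers.Dense(16, activation="sigmoid"),
    keras.layers.Dense(1),                   # linear output for regression
])
model.compile(optimizer="sgd", loss="mse")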
Upvotes: 0
Views: 790
Reputation: 3631
I think your problem concerns the generalization/expressiveness of the model. Regression is a basic task, so there should be no problem with the method itself, only with its execution. @DomJack explained how the output is bounded for a fixed set of parameters, but that bound should only be hit on anomalous data; in general, training tunes the parameters so that the outputs are predicted correctly.
So the first point is the quality of the training data. Make sure you have enough training data, and that it is split randomly if you create the train/test split from a single dataset (see the sketch below). Also, perhaps trivially, make sure you didn't mix up the input and output values in preprocessing.
Another point is the size of the network. Make sure the hidden layers are large enough.
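For illustration, a sketch of a random split (scikit-learn is used here purely as an example; X and y are placeholders for your real data):

import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data standing in for the real dataset.
X = np.random.rand(1000, 4)
y = np.random.rand(1000)

# Shuffle before splitting so train and test cover the same value range;
# a non-random (e.g. chronological) split can leave the largest targets
# entirely outside the training set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=True, random_state=42)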
Upvotes: 0
Reputation: 4183
A linear layer regressing a single value will have outputs of the form
output = bias + sum(kernel * inputs)
If inputs comes from a tanh, then -1 <= inputs <= 1, and hence
bias - sum(abs(kernel)) <= output <= bias + sum(abs(kernel))
If you want an unbounded output, consider using an unbounded activation on all intermediate layers, e.g. relu.
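A quick numerical check of that bound (a standalone sketch; the kernel and bias values are made up, not taken from the question):

import numpy as np

rng = np.random.default_rng(0)
kernel = rng.normal(size=8)   # weights of the final linear layer (made up)
bias = 0.5                    # bias of the final linear layer (made up)

# Activations coming out of a tanh layer are confined to (-1, 1).
hidden = np.tanh(rng.normal(size=(100_000, 8)))
output = hidden @ kernel + bias

lower = bias - np.abs(kernel).sum()
upper = bias + np.abs(kernel).sum()
print(output.min(), output.max())  # empirical range of the predictions
print(lower, upper)                # theoretical bound; always contains the above

No matter how the inputs vary, output can never escape [lower, upper]; with relu hidden units the final layer has no such ceiling.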
Upvotes: 1