Reputation: 111
My understanding of neural network dropout is that you essentially set the value of the neuron you want to drop out to zero.
However, if one of your inputs takes values from a range that includes zero, doesn't that mean the zero set by dropout could be confused with a legitimate zero input?
Upvotes: 2
Views: 777
Reputation: 2585
Think about a neural network from a brain perspective. The output of a neuron behaves like an electrical signal: if the output is zero, there is no signal. By design, a neuron computes its output from a weighted sum of the previous layer's outputs, and zero times anything is zero. A neuron that produces a zero output therefore contributes nothing to the subsequent signal propagation.
Also, a NN is an approximation, so you cannot expect it to produce exactly the values you want. You feed it an input and get a signal back. For some input values you might expect a stronger signal than for others, and during training you teach the NN to do this. Take binary classification: you might expect the NN to produce a signal greater than or equal to zero for positive labels and less than zero for negative labels.
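To make that concrete, here is a minimal NumPy sketch (the activation values and layer sizes are made up for illustration) showing that a zeroed activation adds nothing to the next layer's weighted sum:

    import numpy as np

    rng = np.random.default_rng(0)

    h = np.array([0.7, 0.0, 1.3])   # hidden activations; the middle one was "dropped" to zero
    W = rng.normal(size=(3, 2))     # weights into the next layer

    z = h @ W                       # pre-activations of the next layer
    # The dropped neuron multiplies its whole row of W by zero, so it adds
    # nothing to z, exactly as if it were temporarily absent from the network.
    print(np.allclose(z, h[[0, 2]] @ W[[0, 2]]))   # True: same result without the dropped unit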
Upvotes: 1
Reputation: 51
...Dropout should not be applied to inputs...
I don't think this is the case:
It is not uncommon to use dropout on the inputs. In the original paper the authors usually use dropout with a retention rate of 50% for hidden units and 80% for (real-valued) inputs. For inputs that represent categorical values (e.g. one-hot encoded) a simple dropout procedure might not be appropriate.
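As a rough illustration of input dropout at an 80% retention rate (a minimal NumPy sketch of training-time "inverted" dropout on an arbitrary batch, not the paper's exact code):

    import numpy as np

    rng = np.random.default_rng(0)

    x = rng.normal(size=(4, 5))        # a batch of real-valued inputs
    keep = 0.8                         # retention rate for inputs

    mask = rng.random(x.shape) < keep  # 1 with probability 0.8, 0 otherwise
    x_dropped = x * mask / keep        # rescale so the expected value is unchanged

    # Roughly 20% of the entries are now exactly zero -- indistinguishable
    # from inputs that happened to be zero to begin with.
    print((x_dropped == 0).mean())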
To answer your question: I think it's easier to treat dropout as synthetic noise in this case. So yes, it can never cover up a zero value with 'noise', and that is indeed a problem.
So reasoning it out, instead of all values being equally suspect, high values are trustworthy and zeros are suspect. You'd probably be better off just using Gaussian white noise as usual instead of messing around with all this, unless your real test set might actually be missing inputs... in which case this seems like as good a method as any.
Confusing zero inputs with unknown ones will cause problems, but so will setting unknown inputs to -1 or whatever; it creates an additional disjoint mode for every input and impedes generalization whatever you do.
Offhand I'd suggest that setting unknown values to the expected mean would do the least harm in this regard. So if you normalized your data beforehand (which you should have), zero would be the mean and everything should work as well as it's going to.
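For comparison, a sketch of that Gaussian-noise alternative (the noise scale sigma is an arbitrary choice here, not a recommended value):

    import numpy as np

    rng = np.random.default_rng(0)

    x = rng.normal(size=(4, 5))   # a batch of (already normalized) inputs
    sigma = 0.1                   # noise scale: a tuning knob, not a prescribed value

    x_noisy = x + rng.normal(scale=sigma, size=x.shape)
    # Every value is jittered a little, but none is forced to zero, so a genuine
    # zero input is no longer confusable with an artificially removed one.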
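A sketch of that idea, assuming the features were standardized to zero mean and using np.nan as a stand-in for an unknown value:

    import numpy as np

    x = np.array([[ 0.3, np.nan, -1.2],
                  [-0.8,  0.5,  np.nan],
                  [ 1.1, -0.4,   0.2]])   # standardized features, NaN = missing

    x_filled = np.where(np.isnan(x), 0.0, x)  # zero is the per-feature mean after standardization
    # Missing entries get the least-harmful guess (the mean), which happens to
    # coincide with what a dropout mask would have produced anyway.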
P.S. It occurs to me you could get a more accurate expected mean if you used correlations with the known values to guess the unknown ones, assuming they are correlated. At that point you couldn't use a dropout layer anymore, so there are probably better ways to handle this.
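One ready-made way to do that kind of correlation-based fill-in is, for example, scikit-learn's IterativeImputer, which regresses each feature on the others (a sketch with made-up data, not something the dropout paper prescribes):

    import numpy as np
    from sklearn.experimental import enable_iterative_imputer  # noqa: F401  (enables the estimator)
    from sklearn.impute import IterativeImputer

    x = np.array([[ 0.3, np.nan, -1.2],
                  [-0.8,  0.5,  np.nan],
                  [ 1.1, -0.4,   0.2],
                  [ 0.2,  0.1,  -0.5]])

    # Each missing entry is predicted from the other (correlated) features.
    x_filled = IterativeImputer(random_state=0).fit_transform(x)
    print(x_filled)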
Upvotes: 0
Reputation: 56377
No, because Dropout should not be applied to inputs, only to the output of a given layer. It would make no sense to apply Dropout to the input layer.
Upvotes: 0
Reputation: 6640
Yes, dropout essentially sets the value of the neuron you want to drop out to zero, but the key is that this is done randomly. You can set a different dropout rate for each hidden layer, as in the sketch below. Your input layer is not part of the dropout.
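For example, a minimal PyTorch sketch (layer sizes and rates are arbitrary) with a different dropout rate per hidden layer and no dropout on the inputs:

    import torch.nn as nn

    model = nn.Sequential(
        nn.Linear(20, 64),   # inputs feed straight into the first layer -- no dropout on them
        nn.ReLU(),
        nn.Dropout(p=0.5),   # dropout rate for the first hidden layer
        nn.Linear(64, 32),
        nn.ReLU(),
        nn.Dropout(p=0.3),   # a different rate for the second hidden layer
        nn.Linear(32, 1),
    )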
Upvotes: 0