Reputation: 3616
It is common to use a dropout rate of 0.5 as a default, which I also use in my fully-connected network. This advice follows the recommendations of the original Dropout paper (Hinton et al.).
My network consists of fully-connected layers of size [1000, 500, 100, 10, 100, 500, 1000, 20].
I do not apply dropout to the last layer, but I do apply it to the bottleneck layer of size 10. This does not seem reasonable given that dropout = 0.5; I suspect too much information gets lost. Is there a rule of thumb for how to treat bottleneck layers when using dropout? Is it better to increase the size of the bottleneck or to decrease the dropout rate?
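For concreteness, here is a minimal Keras sketch of my setup; the input dimension (20), the activations, and the loss are placeholders, since I have not listed them above:

```python
# Minimal sketch of the network described above (assumptions: input dim 20,
# ReLU activations, MSE loss -- none of these are specified in the question).
from tensorflow.keras import layers, models

hidden_sizes = [1000, 500, 100, 10, 100, 500, 1000]
rate = 0.5  # the dropout rate under discussion

model = models.Sequential()
model.add(layers.Input(shape=(20,)))  # assumed input dimension
for size in hidden_sizes:
    model.add(layers.Dense(size, activation="relu"))
    # dropout after every hidden layer, including the size-10 bottleneck
    model.add(layers.Dropout(rate))
model.add(layers.Dense(20))  # last layer, no dropout
model.compile(optimizer="adam", loss="mse")
```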
Upvotes: 0
Views: 486
Reputation: 147
A dropout layer is added to a neural network to prevent overfitting (regularization).
Dropout adds noise to a layer's output values in order to break up the happenstance patterns (co-adaptations) that cause overfitting.
A dropout rate of 0.5 means 50% of the values are dropped, which is a high noise ratio and a definite no for a bottleneck layer: with only 10 units, each forward pass keeps roughly 5 of them, so most of the information the bottleneck is supposed to compress gets lost.
I would recommend you first train your bottleneck layer without dropout, then compare the results against increasing dropout rates.
Choose the model that validates best on your test data.
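A hedged sketch of that comparison, again in Keras; `x_train` / `x_val` and the input dimension are placeholders you would replace with your own data:

```python
# Build two variants: one with dropout on the size-10 bottleneck, one without.
from tensorflow.keras import layers, models

def build(bottleneck_dropout):
    sizes = [1000, 500, 100, 10, 100, 500, 1000]
    model = models.Sequential([layers.Input(shape=(20,))])  # assumed input dim
    for size in sizes:
        model.add(layers.Dense(size, activation="relu"))
        if size == 10 and not bottleneck_dropout:
            continue  # skip dropout on the size-10 bottleneck
        model.add(layers.Dropout(0.5))
    model.add(layers.Dense(20))
    model.compile(optimizer="adam", loss="mse")
    return model

# Train both variants and keep whichever validates better, e.g.:
# for flag in (False, True):
#     model = build(bottleneck_dropout=flag)
#     model.fit(x_train, x_train, validation_data=(x_val, x_val), epochs=50)
```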
Upvotes: 2