Reputation: 349
I am using TensorFlow to train a convnet on a set of 15,000 training images with 22 classes. The network has two conv layers and one fully connected layer. Training on the 15,000 images converges and reaches high accuracy on the training set.
However, accuracy on my test set is much lower, so I assume the network is overfitting. To combat this I added dropout before the fully connected layer of my network.
However, adding dropout has caused the network to never converge, even after many iterations. I was wondering why this may be. I have even used a high keep probability of 0.9 (i.e. only 10% of units dropped) and seen the same result.
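For reference, the setup described above can be sketched in NumPy (the layer sizes, keep probability, and weight initialization below are illustrative placeholders, not the asker's actual values) — inverted dropout applied to the flattened conv features just before the fully connected layer:

```python
import numpy as np

rng = np.random.default_rng(0)

n_features, n_classes = 512, 22      # illustrative sizes; 22 classes as in the question
W = rng.normal(0.0, 0.01, (n_features, n_classes))
b = np.zeros(n_classes)

def fc_with_dropout(features, keep_prob=0.9, training=True):
    """Apply inverted dropout to the conv features, then the
    fully connected layer, mirroring the question's setup."""
    if training:
        mask = rng.random(features.shape) < keep_prob   # keep each unit with prob keep_prob
        features = features * mask / keep_prob          # rescale survivors
    return features @ W + b

logits = fc_with_dropout(rng.random(n_features))
```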
Upvotes: 7
Views: 3957
Reputation: 4201
A keep probability of 0.9 means each neuron's output is dropped with 10% probability on every iteration, so dropout, like other hyperparameters, has an optimum value you need to tune.
Note that dropout also rescales the surviving activations. With the usual "inverted dropout" formulation, the kept activations are divided by the keep probability during training, so with a keep probability of 0.9 they are scaled up by a factor of 1/0.9 ≈ 1.11, and no scaling is applied at test time. With a keep probability of 0.5 the training-time scaling factor would be 2 instead.
This gives you an idea of how dropout can affect training: the dropped connections and the rescaling change the activation statistics, which in some cases can saturate your nodes and cause the non-convergence you are seeing.
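A minimal NumPy sketch of the inverted-dropout scaling described above (the function name and sizes are illustrative): survivors are divided by the keep probability during training so the expected activation is unchanged, and the function is the identity at test time.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(x, keep_prob, training):
    """Inverted dropout: zero each unit with probability (1 - keep_prob)
    and divide the survivors by keep_prob so the expected activation
    is preserved; at test time, return x unchanged."""
    if not training:
        return x
    mask = rng.random(x.shape) < keep_prob
    return x * mask / keep_prob

x = np.ones(100_000)
y = dropout(x, keep_prob=0.9, training=True)
# Roughly 10% of units are zeroed, survivors become 1/0.9 ≈ 1.11,
# and the mean stays close to 1.0.
```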
Upvotes: 1
Reputation: 1
You shouldn't use a dropout rate of 0.9 (note this is the drop rate, not the keep probability — a keep probability of 0.9 corresponds to a dropout rate of only 0.1). With a rate that high you are losing most features in your training phase. In most work I've seen, dropout rates fall between 0.2 and 0.5. Using too much dropout can cause problems in training: a longer time to converge, or in rare cases the network learning something wrong.
You need to be careful with dropout. As the image below illustrates, dropout prevents features from reaching the next layer, so using too many dropout layers or a very high dropout rate can kill the learning.
(image: illustration of dropout masking features between layers)
Upvotes: 0
Reputation: 1725
You can add dropout to your dense layers after the convolutional layers and remove dropout from the convolutional layers themselves. If you want many more examples, you can also put some white noise (randomize 5% of the pixels) on each picture, giving you P and P' variants of each picture. This can improve your results.
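The white-noise augmentation suggested above can be sketched in NumPy (the helper name `add_pixel_noise` is hypothetical, and images are assumed to be grayscale with values in [0, 1]): each call produces a noisy variant P' of the original picture P by replacing about 5% of the pixels with random values.

```python
import numpy as np

rng = np.random.default_rng(1)

def add_pixel_noise(img, frac=0.05):
    """Return a copy of img with roughly `frac` of its pixels replaced
    by uniform random values in [0, 1) — a cheap way to generate extra
    training variants of each picture."""
    noisy = img.copy()
    mask = rng.random(img.shape) < frac    # select ~frac of the pixels
    noisy[mask] = rng.random(mask.sum())   # overwrite them with noise
    return noisy

img = np.full((28, 28), 0.5)   # toy grayscale picture P
aug = add_pixel_noise(img)     # noisy variant P'
```

Training on both `img` and `aug` (with the same label) roughly doubles the effective number of examples per picture.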
Upvotes: 0