Reputation: 11
I have implemented a convolutional neural network with batch normalization on a 1D input signal. My model achieves a pretty good accuracy of ~80%. Here is the order of my layers: (Conv1D, BatchNorm, ReLU, MaxPooling) repeated 6 times, then Conv1D, BatchNorm, ReLU, Dense, Softmax.
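For reference, here is a minimal Keras-style sketch of that architecture (the framework is my assumption from the layer names; the filter count, kernel size, input shape, and number of classes are placeholders, and a Flatten layer is assumed before the Dense layer):

```python
from tensorflow.keras import layers, models

def build_batchnorm_model(input_shape=(1024, 1), num_classes=5):
    """The described layer order; all sizes here are placeholders."""
    model = models.Sequential()
    model.add(layers.Input(shape=input_shape))
    # (Conv1D, BatchNorm, ReLU, MaxPooling) repeated 6 times
    for _ in range(6):
        model.add(layers.Conv1D(64, kernel_size=3, padding="same"))
        model.add(layers.BatchNormalization())
        model.add(layers.Activation("relu"))
        model.add(layers.MaxPooling1D(pool_size=2))
    # Conv1D, BatchNorm, ReLU, then the classifier head
    model.add(layers.Conv1D(64, kernel_size=3, padding="same"))
    model.add(layers.BatchNormalization())
    model.add(layers.Activation("relu"))
    model.add(layers.Flatten())  # assumed: flatten before the Dense layer
    model.add(layers.Dense(num_classes, activation="softmax"))
    return model
```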
I have seen several articles saying that I should NOT use dropout on convolutional layers and should use batch normalization instead, so I want to experiment with my model by replacing all batch normalization layers with dropout layers to see whether dropout really makes performance worse.
My new model has the following structure: (Conv1D, Dropout, ReLU, MaxPooling) repeated 6 times, then Conv1D, Dropout, ReLU, Dense, Softmax. I have tried dropout rates of 0.1, 0.2, 0.3, 0.4, and 0.5. The new model's accuracy is only ~25%, much worse than my original model's, and even worse than always predicting the dominant class (~40%).
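Written out, the only change is that each BatchNormalization layer becomes a Dropout layer, which therefore sits before the activation (again a Keras-style sketch with placeholder sizes):

```python
from tensorflow.keras import layers, models

def conv_dropout_block(model, rate):
    """The modified repeating block: Dropout replaces BatchNormalization,
    so it sits *before* the activation."""
    model.add(layers.Conv1D(64, kernel_size=3, padding="same"))
    model.add(layers.Dropout(rate))  # was: layers.BatchNormalization()
    model.add(layers.Activation("relu"))
    model.add(layers.MaxPooling1D(pool_size=2))

model = models.Sequential()
model.add(layers.Input(shape=(1024, 1)))
for _ in range(6):
    conv_dropout_block(model, rate=0.2)  # tried rates 0.1 through 0.5
```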
I wonder whether the huge difference in performance actually results from replacing batch normalization with dropout, or whether it comes from my misunderstanding of how dropout should be used.
Upvotes: 0
Views: 5178
Reputation: 241
To get an intuition for how to use batch norm and dropout, you should first understand what these layers do: batch normalization normalizes a layer's activations using mini-batch statistics, which stabilizes and speeds up training, while dropout randomly zeroes a fraction of the activations during training, injecting noise that regularizes the network against overfitting.
What you did is replace your normalization layers with layers that add extra noise to the information flow, which of course leads to a drastic decrease in accuracy.
My recommendation is to use batch norm just like in your first setup, and if you want to experiment with dropout, add it after the activation function of the previous layer. Dropout is usually used to regularize dense layers, which are very prone to overfitting. Try this:
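A minimal sketch of what I mean, assuming a Keras-style model (the filter count, kernel size, input shape, class count, and the 0.5 rate are placeholders to adapt to your data):

```python
from tensorflow.keras import layers, models

def build_recommended_model(input_shape=(1024, 1), num_classes=5):
    """Batch norm stays in the conv blocks; dropout only regularizes the
    dense head, after the activation. All sizes are placeholders."""
    model = models.Sequential()
    model.add(layers.Input(shape=input_shape))
    for _ in range(6):
        model.add(layers.Conv1D(64, kernel_size=3, padding="same"))
        model.add(layers.BatchNormalization())
        model.add(layers.Activation("relu"))
        model.add(layers.MaxPooling1D(pool_size=2))
    model.add(layers.Conv1D(64, kernel_size=3, padding="same"))
    model.add(layers.BatchNormalization())
    model.add(layers.Activation("relu"))
    model.add(layers.Flatten())
    model.add(layers.Dropout(0.5))  # dropout after activation, before Dense
    model.add(layers.Dense(num_classes, activation="softmax"))
    return model
```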
Upvotes: 1