Reputation: 231
I am reading a paper which implementing CNN but i dont understand this sentence Instead of using max-pooling layers to reduce the size of the feature maps, we use convolution layers with increased strides. i don't know how this can replace max pooling , what i am missing ?
Upvotes: 3
Views: 4296
Reputation: 136379
Naively speaking, a filter of a CNN works by moving the filter matrix (e.g. 3x3x1 for simplicity as in the following image) over every possible possition. This means you move the filter by one to the right each time and when the row is ready you go back and down.
In the following gif, the original image is dashed, the filter is the gray-ish thing and the result is the green image:
However, you could also move by 2 every time. It is the same as if you would simply subsample the result. If you move by a stride of 2, you divide the feature map dimensions by 2 (both). This means your feature map only has 1/4th of the original size. This is exactly the same way how pooling reduces the feature map size. In fact, convolutional filters can learn average and max pooling.
Upvotes: 5