Reputation: 6602
For example, in the famous AlexNet architecture (original paper), what is the difference between using two 3*3 convolution filters and using one 5*5 convolution filter?
The two 3*3 convolution filters and the one 5*5 convolution filter are highlighted with red rectangles in the image below.
What about using another 5*5 convolution filter to replace the two 3*3 convolution filters, or vice versa?
Upvotes: 1
Views: 3723
Reputation: 795
If you still have some doubts, I hope this helps.
If you stack two 3x3 conv layers, the stack has a receptive field of 5x5 with respect to the input (the same as one 5x5 conv layer). However, the advantage of using smaller conv layers like 3x3 is that they need fewer parameters: assuming 1 channel, two 3x3 layers have 2*(3*3) = 18 weights and one 5x5 layer has 1*(5*5) = 25. Also, two conv layers have a non-linearity in between, which a single 5x5 layer lacks, so the stack has more discriminative power.
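To make the parameter count concrete, here is a minimal sketch in plain Python. The helper names `conv_params` and `stacked_params` are made up for illustration, and it assumes every layer in the stack has the same number of input and output channels and no bias terms.

```python
def conv_params(kernel_size, in_channels, out_channels):
    """Weights in one square conv layer (bias terms ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels


def stacked_params(kernel_size, channels, num_layers):
    """Weights in a stack of identical conv layers with `channels` in and out."""
    return num_layers * conv_params(kernel_size, channels, channels)


# Single-channel case from the answer above.
two_3x3 = stacked_params(kernel_size=3, channels=1, num_layers=2)
one_5x5 = stacked_params(kernel_size=5, channels=1, num_layers=1)
print(two_3x3, one_5x5)

# With C channels the counts scale as 18*C^2 versus 25*C^2.
channels = 64  # arbitrary value chosen for illustration
print(stacked_params(3, channels, 2), stacked_params(5, channels, 1))
```

The ratio stays at 18/25 no matter how many channels you pick, so the saving is about 28% under these assumptions.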
For the receptive field part, I hope this sheet of mine helps you visualize it (BTW it's the answer sheet from my exam):
Upvotes: 2
Reputation: 6602
I found the answer in the paper "Very Deep Convolutional Networks for Large-Scale Image Recognition".
Rather than using relatively large receptive fields in the first conv. layers (e.g. 11×11 with stride 4 in (Krizhevsky et al., 2012), or 7×7 with stride 2 in (Zeiler & Fergus, 2013; Sermanet et al., 2014)), we use very small 3×3 receptive fields throughout the whole net, which are convolved with the input at every pixel (with stride 1). It is easy to see that a stack of two 3×3 conv. layers (without spatial pooling in between) has an effective receptive field of 5×5; three such layers have a 7×7 effective receptive field.
Two stacked 3*3 convolution filters have the same effective receptive field as one 5*5 convolution filter.
Two 3*3 convolution filters have fewer parameters than one 5*5 convolution filter.
Two 3*3 convolution filters make the network deeper, with an extra non-linearity in between, so it can extract more complex features than one 5*5 convolution filter.
Paper: https://arxiv.org/pdf/1409.1556.pdf
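The receptive field claim in the quote can be checked with a few lines of plain Python. This is only a sketch: `receptive_field` is a hypothetical helper, and it assumes square kernels, no dilation, and no pooling between the layers. Each layer adds (kernel_size - 1) input pixels, scaled by the product of the strides of the layers before it.

```python
def receptive_field(layers):
    """Effective receptive field of a stack of conv layers.

    `layers` is a list of (kernel_size, stride) pairs, ordered from the
    input side to the output side.
    """
    rf = 1    # a single output pixel sees one pixel before any layer
    jump = 1  # distance in input pixels between neighbouring outputs
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf


print(receptive_field([(3, 1), (3, 1)]))          # two 3x3 layers, stride 1
print(receptive_field([(5, 1)]))                  # one 5x5 layer, stride 1
print(receptive_field([(3, 1), (3, 1), (3, 1)]))  # three 3x3 layers, stride 1
```

With stride 1 everywhere, the first two calls both give 5 and the third gives 7, matching the numbers in the quote.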
Upvotes: 2