Reputation: 3417
The above image is from a PDF by Yann LeCun, titled "Hierarchical Models Of Perception and Reasoning".
I am not able to understand how layer 2 ends up with 14x14 feature maps. How can a 75x75 matrix with 10x10 pooling and 5x5 subsampling give a 14x14 matrix?
Upvotes: 1
Views: 3971
Reputation: 16121
If you refer to this other paper by LeCun et al., an identical network is used with a larger input (a 143x143 grayscale image):
The first stage has 64 filters of size 9x9, followed by a subsampling layer with 5x5 stride, and 10x10 averaging window. [...]
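So the 5x5 is the stride (step) of the 10x10 averaging window, not a second window. Applied to the 75x75 feature maps, this gives the right dimension: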
output size = (input size - window size) / step + 1
= (75-10) / 5 + 1
= 14
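To check the arithmetic, here is a minimal Python sketch of the same formula. The function name `pooled_size` and its parameter names are my own, not from the paper, and it assumes no padding and that any partial window at the border is dropped.

```python
def pooled_size(input_size: int, window: int, stride: int) -> int:
    """Output length along one dimension when a window slides without padding.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if window > input_size:
        raise ValueError("window must not be larger than the input")
    if stride < 1:
        raise ValueError("stride must be at least 1")
    # Integer division: a partial window at the edge is assumed to be dropped.
    return (input_size - window) // stride + 1


# 75x75 feature map, 10x10 averaging window, stride 5 in each direction
side = pooled_size(75, 10, 5)
print(f"{side}x{side}")  # prints "14x14"
```

The same value applies to both height and width because the map, the window and the stride are all square.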
Upvotes: 5