Reputation: 3
I am new to this forum and I have started studying the theory of CNNs. It is probably a stupid question, but I am confused about how to calculate the shape of a CNN's outputs. I am following a course on Udacity, and in one of the tutorials they provide this CNN architecture.
import torch.nn as nn
import torch.nn.functional as F

# define the CNN architecture
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # convolutional layer (sees 32x32x3 image tensor)
        self.conv1 = nn.Conv2d(3, 16, 3, padding=1)
        # convolutional layer (sees 16x16x16 tensor)
        self.conv2 = nn.Conv2d(16, 32, 3, padding=1)
        # convolutional layer (sees 8x8x32 tensor)
        self.conv3 = nn.Conv2d(32, 64, 3, padding=1)
        # max pooling layer
        self.pool = nn.MaxPool2d(2, 2)
        # linear layer (64 * 4 * 4 -> 500)
        self.fc1 = nn.Linear(64 * 4 * 4, 500)
        # linear layer (500 -> 10)
        self.fc2 = nn.Linear(500, 10)
        # dropout layer (p=0.25)
        self.dropout = nn.Dropout(0.25)
Could you please help me understand how they calculate the outputs of the CNN layers? (The starting shape of the images is 32x32x3.) More specifically, how did they end up with this:
# linear layer (64 * 4 * 4 -> 500)
self.fc1 = nn.Linear(64 * 4 * 4, 500)
Thanks a lot
Upvotes: 0
Views: 213
Reputation: 1822
The code is missing the definition of the forward pass, but from the comments one can guess that a 2x2 max pooling is applied after each conv layer. Each pooling step halves the spatial dimensions, so the 32x32 images become 16x16 after conv1 (+ 2x2 pooling), 8x8 after conv2 (+ 2x2 pooling), and 4x4 after conv3 (+ 2x2 pooling). The 3x3 convolutions themselves preserve the spatial size because of padding=1. Since conv3 has 64 filters, it outputs 64 feature maps of size 4x4. fc1 then maps this flattened tensor (64 * 4 * 4 = 1024 values) to a fully connected layer of size 500, which is exactly what is defined by the line
self.fc1 = nn.Linear(64 * 4 * 4, 500)
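To double-check the arithmetic, you can trace the spatial size through each stage with the standard output-size formula, floor((in + 2*pad - kernel)/stride) + 1, in plain Python (no torch needed; the layer names just mirror the ones in the question, and the per-layer kernel/padding values are taken from its code):

```python
# Spatial output size of a conv or pooling layer:
# floor((in + 2*pad - kernel) / stride) + 1
def conv_out(size, kernel=3, stride=1, pad=1):
    # 3x3 conv with padding=1 and stride=1, as in the question
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, kernel=2, stride=2):
    # 2x2 max pooling with stride 2 halves the spatial size
    return (size - kernel) // stride + 1

size = 32  # input images are 32x32
for name, channels in [("conv1", 16), ("conv2", 32), ("conv3", 64)]:
    size = conv_out(size)  # conv keeps the spatial size (padding=1)
    size = pool_out(size)  # pooling halves it
    print(f"after {name} + pool: {channels} x {size} x {size}")

flat = 64 * size * size  # features fed into fc1
print("flattened:", flat)  # 64 * 4 * 4 = 1024
```

This prints 16x16 after conv1, 8x8 after conv2, and 4x4 after conv3, confirming the 64 * 4 * 4 input size of fc1.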
Upvotes: 1