Reputation: 41
I am using PyTorch 1.8.1, and although I know newer versions have a padding="same" option, for some reasons I do not want to upgrade. To implement "same" padding for a CNN with stride 1 and dilation > 1, I set the padding as follows:
padding=(dilation*(cnn_kernel_size[0]-1)//2, dilation*(cnn_kernel_size[1]-1)//2)
According to the PyTorch documentation, I expected the input and output sizes to be the same, but that did not happen! The PyTorch documentation states:
Hout=⌊( Hin + 2×padding[0] − dilation[0]×(kernel_size[0]−1) −1) /stride[0] + 1⌋
Wout=⌊( Win + 2×padding[1] − dilation[1]×(kernel_size[1]−1) −1) /stride[1] + 1⌋
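The output-size formulas above can be sketched as a small helper (a minimal sketch; the function name `conv2d_out` is my own, not from PyTorch):

```python
import math

def conv2d_out(size, kernel, stride=1, padding=0, dilation=1):
    # Output size per the Conv2d formula:
    # floor((size + 2*padding - dilation*(kernel - 1) - 1) / stride + 1)
    return math.floor((size + 2 * padding - dilation * (kernel - 1) - 1) / stride + 1)

# The question's values: kernel_size=15, dilation=5, padding=5*(15-1)//2 = 35
print(conv2d_out(625, 15, padding=35, dilation=5))  # 625
print(conv2d_out(513, 15, padding=35, dilation=5))  # 513
```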
The input to torch.nn.Conv2d had shape (1, 1, 625, 513), which, per the Conv2d documentation, means batch size = 1, C_in = 1, H_in = 625, and W_in = 513.
With kernel_size = 15, dilation = 5, and the padding formula above (giving padding = 35), putting those values into the formulas gives:
Hout = ⌊(625 + 2×35 − 5×(15−1) − 1)/1 + 1⌋ = ⌊(625 + 70 − 70 − 1)/1 + 1⌋ = 625
Wout = ⌊(513 + 2×35 − 5×(15−1) − 1)/1 + 1⌋ = ⌊(513 + 70 − 70 − 1)/1 + 1⌋ = 513
However, the output shape given by PyTorch was (1, 64, 681, 569).
I can understand batch size = 1 and C_out = 64, but I don't know why H_out and W_out are not the same as H_in and W_in. Does anyone have an explanation that can help?
Upvotes: 1
Views: 1877
Reputation: 41
I figured it out! The reason I ended up with the wrong dimensions was that I didn't pass a numeric value for padding; instead I passed the expression computed from dilation:
padding=(dilation*(cnn_kernel_size[0]-1)//2, dilation*(cnn_kernel_size[1]-1)//2)
I think PyTorch needs to be given a numeric value for padding, because when I changed my code to give the network a numeric padding value and calculated the dilation from it (dilation = (2×padding)/(kernel_size − 1)), I got the right output shape.
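The fix described above can be sketched by computing the padding as plain integers before constructing the layer (a sketch using the question's values; variable names are my own):

```python
import torch

kernel_size = (15, 15)
dilation = 5

# Compute the "same" padding numerically, before building the layer
padding = (dilation * (kernel_size[0] - 1) // 2,
           dilation * (kernel_size[1] - 1) // 2)  # (35, 35)

conv = torch.nn.Conv2d(1, 64, kernel_size=kernel_size, stride=1,
                       dilation=dilation, padding=padding)
out = conv(torch.rand(1, 1, 625, 513))
print(out.shape)  # torch.Size([1, 64, 625, 513])
```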
Upvotes: 2
Reputation: 114786
I think your calculations are correct: the padding should be 35 pixels.
For some reason, I am unable to reproduce the output shape you report.
Testing this yields the desired output size:
import torch

# kernel_size=15, dilation=5 → "same" padding = 5 * (15 - 1) // 2 = 35
conv = torch.nn.Conv2d(1, 64, kernel_size=15, dilation=5, padding=35, stride=1)
conv(torch.rand(1, 1, 625, 513)).shape
Yields
torch.Size([1, 64, 625, 513])
As expected.
Upvotes: 1