Soheila Hesaraki

Reputation: 41

Implement "same" padding for convolution operations with dilation > 1 in PyTorch

I am using PyTorch 1.8.1. I know that newer versions have a padding="same" option, but for various reasons I do not want to upgrade. To implement "same" padding for a convolution with stride 1 and dilation > 1, I set the padding as follows:

padding=(dilation*(cnn_kernel_size[0]-1)//2, dilation*(cnn_kernel_size[1]-1)//2)

According to the PyTorch documentation, I expected the output size to equal the input size, but that is not what happened.

The PyTorch documentation gives the output size as:

H_out = ⌊(H_in + 2×padding[0] − dilation[0]×(kernel_size[0] − 1) − 1) / stride[0] + 1⌋

W_out = ⌊(W_in + 2×padding[1] − dilation[1]×(kernel_size[1] − 1) − 1) / stride[1] + 1⌋

The input to torch.nn.Conv2d had shape (1, 1, 625, 513), which, per the Conv2d documentation, means batch size = 1, C_in = 1, H_in = 625, and W_in = 513.

With kernel_size = 15, dilation = 5, and therefore padding = 35, the formulas above give:

H_out = ⌊(625 + 2×35 − 5×(15 − 1) − 1) / 1 + 1⌋ = ⌊(625 + 70 − 5×14 − 1) + 1⌋ = 625

W_out = ⌊(513 + 2×35 − 5×(15 − 1) − 1) / 1 + 1⌋ = ⌊(513 + 70 − 5×14 − 1) + 1⌋ = 513
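
The hand calculation can be double-checked in plain Python, without building a layer. This is a minimal sketch of the documented formula for one spatial dimension; conv_out_size is a hypothetical helper name, not part of PyTorch, and the inputs are the values used in this question (kernel size 15, dilation 5, padding 35, stride 1).

```python
def conv_out_size(size_in, kernel_size, padding=0, dilation=1, stride=1):
    """Output length along one spatial dimension, per the Conv2d shape formula."""
    # floor(x / stride + 1) equals floor(x / stride) + 1, so integer division works here.
    return (size_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# Values from the question: H_in = 625, W_in = 513.
h_out = conv_out_size(625, kernel_size=15, padding=35, dilation=5)
w_out = conv_out_size(513, kernel_size=15, padding=35, dilation=5)
print(h_out, w_out)  # 625 513
```

If the numbers printed here disagree with the shape the layer actually returns, the layer was most likely constructed with different arguments than the ones assumed in the calculation.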

However, the output shape PyTorch actually returned was (1, 64, 681, 569).

I understand the batch size of 1 and C_out = 64, but I don't know why H_out and W_out differ from H_in and W_in. Does anyone have an explanation?
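
One way to narrow down a mismatch like this is to solve the shape formula for the padding the layer must actually have used. The sketch below assumes stride 1 and assumes the layer really received kernel_size=15 and dilation=5; implied_padding is a hypothetical helper, not a PyTorch function.

```python
def implied_padding(size_in, size_out, kernel_size, dilation):
    """Solve the stride-1 output-size formula for the per-side padding."""
    # With stride 1: size_out = size_in + 2*padding - dilation*(kernel_size - 1)
    twice_padding = size_out - size_in + dilation * (kernel_size - 1)
    if twice_padding % 2 != 0:
        raise ValueError("no symmetric integer padding matches these sizes")
    return twice_padding // 2


# Observed shapes from the question: 625 -> 681 and 513 -> 569.
print(implied_padding(625, 681, kernel_size=15, dilation=5))  # 63
print(implied_padding(513, 569, kernel_size=15, dilation=5))  # 63
```

Under those assumptions both dimensions point to a padding of 63 rather than 35, which suggests the padding expression was evaluated with values other than the ones used in the hand calculation.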

Upvotes: 1

Views: 1877

Answers (2)

Soheila Hesaraki

Reputation: 41

I figured it out! I ended up with the wrong dimensions because I didn't give padding a fixed numeric value. Instead, I set a numeric value for dilation and let the padding be computed from it as

padding=(dilation*(cnn_kernel_size[0]-1)//2, dilation*(cnn_kernel_size[1]-1)//2)

I think PyTorch needs to be given the numeric value of the padding: when I changed my code to pass the network a fixed padding value and computed the dilation from it (dilation = (2*padding)/(kernel size - 1)), I got the right output shape.
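
For reference, the padding can also be computed once, up front, as plain integers and then passed to the layer. This is a minimal sketch under the assumption of stride 1 and symmetric padding; same_padding is a hypothetical helper name, and the kernel size and dilation are the values from the question.

```python
def same_padding(kernel_size, dilation=1):
    """Per-side padding that keeps the size unchanged at stride 1."""
    total = dilation * (kernel_size - 1)
    if total % 2 != 0:
        # An odd total cannot be split evenly between the two sides.
        raise ValueError("dilation * (kernel_size - 1) must be even for symmetric padding")
    return total // 2


cnn_kernel_size = (15, 15)
dilation = 5
padding = tuple(same_padding(k, dilation) for k in cnn_kernel_size)
print(padding)  # (35, 35)
```

Printing the resulting tuple before constructing the layer makes it easy to confirm that the values reaching Conv2d are the ones intended.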

Upvotes: 2

Shai

Reputation: 114786

I think your calculations are correct. The padding should be 35 pixels.
For some reason, I am unable to reproduce the output shape you report.

Testing this yields the desired output size:

import torch

# kernel_size=15 with dilation=5 needs padding = 5 * (15 - 1) // 2 = 35
conv = torch.nn.Conv2d(1, 64, kernel_size=15, dilation=5, padding=35, stride=1)
print(conv(torch.rand(1, 1, 625, 513)).shape)

Yields

torch.Size([1, 64, 625, 513])

As expected.

Upvotes: 1
