Reputation: 217
I'm trying to convert the following Keras model code to PyTorch, but I'm having trouble handling padding='same'.
model = Sequential()
model.add(Conv2D(64, (3, 3), input_shape=img_size))
model.add(BatchNormalization(axis=1))
model.add(Activation('relu'))
model.add(Dropout(0.3))
model.add(Conv2D(64, (3, 3), padding='same'))
model.add(BatchNormalization(axis=1))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='same'))
Which produces the following summary:
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 30, 30, 64) 1792
_________________________________________________________________
batch_normalization_1 (Batch (None, 30, 30, 64) 120
_________________________________________________________________
activation_1 (Activation) (None, 30, 30, 64) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 30, 30, 64) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 30, 30, 64) 36928
_________________________________________________________________
batch_normalization_2 (Batch (None, 30, 30, 64) 120
_________________________________________________________________
activation_2 (Activation) (None, 30, 30, 64) 0
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 15, 15, 64) 0
=================================================================
Total params: 38,960
Trainable params: 38,840
Non-trainable params: 120
Right now, I would write:
self.features = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, bias=False),
    nn.BatchNorm2d(64),
    nn.ReLU(inplace=True),
    nn.Dropout(0.3),
    nn.Conv2d(64, 64, kernel_size=3, padding=?, bias=False),
    nn.BatchNorm2d(64),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=3, stride=2, padding=?),
)
Here padding needs a numerical value. Is there an easier way to calculate it, given that the Keras model uses padding='same'?
Also, the next line of the Keras model looks like:
model.add(Conv2D(128, (3, 3), padding='same'))
So I really need to brush up on how to calculate padding, especially when a stride is involved. At a rough guess, is the padding 2?
Upvotes: 8
Views: 17025
Reputation: 6618
Use the Conv2d subclass below when you want padding = 'same' as in Keras with stride = 2 or more.
import math

import torch
import torch.nn.functional as F


class Conv2dSame(torch.nn.Conv2d):
    def calc_same_pad(self, i: int, k: int, s: int, d: int) -> int:
        # Total padding along one axis so that the output size is ceil(i / s).
        return max((math.ceil(i / s) - 1) * s + (k - 1) * d + 1 - i, 0)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        ih, iw = x.size()[-2:]
        pad_h = self.calc_same_pad(i=ih, k=self.kernel_size[0], s=self.stride[0], d=self.dilation[0])
        pad_w = self.calc_same_pad(i=iw, k=self.kernel_size[1], s=self.stride[1], d=self.dilation[1])
        if pad_h > 0 or pad_w > 0:
            # F.pad order is (left, right, top, bottom); any odd pixel goes right/bottom.
            x = F.pad(
                x, [pad_w // 2, pad_w - pad_w // 2, pad_h // 2, pad_h - pad_h // 2]
            )
        # self.padding keeps its default of 0 here; the padding was applied above.
        return F.conv2d(
            x,
            self.weight,
            self.bias,
            self.stride,
            self.padding,
            self.dilation,
            self.groups,
        )


conv_layer_s2_same = Conv2dSame(in_channels=3, out_channels=64, kernel_size=(7, 7), stride=(2, 2), groups=1, bias=True)
out = conv_layer_s2_same(torch.zeros(1, 3, 224, 224))
credit : Captum
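To see what calc_same_pad returns for the layer above, the same arithmetic can be run without PyTorch. This is only a minimal sketch; the standalone function names same_pad_total and split_pad are made up for illustration and are not part of any library.

```python
import math


def same_pad_total(i: int, k: int, s: int, d: int = 1) -> int:
    """Total padding along one axis so the output has ceil(i / s) elements."""
    return max((math.ceil(i / s) - 1) * s + (k - 1) * d + 1 - i, 0)


def split_pad(total: int) -> tuple:
    """Split the total into (before, after); any odd pixel goes after."""
    return total // 2, total - total // 2


# 224x224 input, 7x7 kernel, stride 2, as in the usage example above.
total = same_pad_total(i=224, k=7, s=2)
print(total, split_pad(total))  # 5 (2, 3)

# 30x30 input, 3x3 kernel, stride 1, as in the question.
print(same_pad_total(i=30, k=3, s=1))  # 2, i.e. one pixel on each side
```

With stride 1 and an odd kernel the total is always even, so a plain integer padding argument is enough; the asymmetric split only matters for strides above 1 or even kernels.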
Upvotes: 2
Reputation: 3736
In PyTorch you pass the padding directly as an integer.
For a convolution with a 3x3 kernel and stride=1, padding=1 is equivalent to "same" in Keras.
For MaxPool with a 2x2 kernel and stride=2, leave padding=0 (the default); on an even-sized input this gives the same output size as "same" in Keras.
You can use the formula:
Out = floor((W + 2P - K) / S) + 1
Let's work through some calculations:
For Convolution:
Case 1:
input is 30x30, kernel_size(K) is 3x3, stride=1, padding=1:
Out = (30+2*1-3)/1 + 1 = floor(29/1) + 1 = 30 i.e 30x30 (~ padding="same")
Case 2:
input is 30x30, kernel_size(K) is 3x3, stride=1, padding=0:
Out = (30+2*0-3)/1 + 1 = floor(27/1) + 1 = 28 i.e 28x28 (~ padding="valid")
For MaxPooling:
Case 1:
input is 30x30, kernel_size(K) is 2x2, stride=2, padding=0:
Out = (30+2*0-2)/2 + 1 = floor(28/2) + 1 = 15 i.e 15x15 (~ padding="same")
Case 2:
input is 30x30, kernel_size(K) is 2x2, stride=2, padding=1:
Out = (30+2*1-2)/2 + 1 = floor(30/2) + 1 = 16 i.e 16x16 (matches neither Keras mode: both "same" and "valid" give 15x15 for this input)
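The four cases above can be checked with a few lines of Python. This is a minimal sketch: out_size is a hypothetical helper name, and integer division supplies the floor used in the worked cases.

```python
def out_size(w: int, k: int, s: int = 1, p: int = 0) -> int:
    """Output length along one axis: floor((W + 2P - K) / S) + 1."""
    return (w + 2 * p - k) // s + 1


# Convolution, 3x3 kernel, stride 1.
print(out_size(30, k=3, s=1, p=1))  # 30
print(out_size(30, k=3, s=1, p=0))  # 28

# Max pooling, 2x2 kernel, stride 2.
print(out_size(30, k=2, s=2, p=0))  # 15
print(out_size(30, k=2, s=2, p=1))  # 16
```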
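Here is a PyTorch program that reproduces the output shapes of the Keras code above. Note that the first Keras Conv2D uses the default padding='valid' (its 30x30 output implies a 32x32 input), whereas this version pads both convolutions and starts from a 30x30 input:
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=(3, 3), padding=1, bias=False),
    nn.BatchNorm2d(64),
    nn.ReLU(inplace=True),
    nn.Dropout(0.3),
    nn.Conv2d(64, 64, kernel_size=(3, 3), padding=1, bias=False),
    nn.BatchNorm2d(64),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=(2, 2), stride=2, padding=0)
)
X = torch.rand((1, 3, 30, 30))
print(model)
# Print the output shape after each layer.
for layer in model:
    X = layer(X)
    print(X.shape)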
Output:
Sequential(
(0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU()
(3): Dropout(p=0.3, inplace=False)
(4): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(5): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(6): ReLU()
(7): MaxPool2d(kernel_size=(2, 2), stride=2, padding=0, dilation=1, ceil_mode=False)
)
torch.Size([1, 64, 30, 30])
torch.Size([1, 64, 30, 30])
torch.Size([1, 64, 30, 30])
torch.Size([1, 64, 30, 30])
torch.Size([1, 64, 30, 30])
torch.Size([1, 64, 30, 30])
torch.Size([1, 64, 30, 30])
torch.Size([1, 64, 15, 15])
Upvotes: 3
Reputation: 1213
A complete formula to calculate padding can be found in the documentation of PyTorch:
Source: https://pytorch.org/docs/master/generated/torch.nn.Conv2d.html?highlight=conv2d#torch.nn.Conv2d
This formula takes the kernel size, stride, and dilation into account.
Based on this equation, you can try different padding sizes (as guesses) until you find an appropriate value that solves the equation.
Depending on the size of your images, you can write a solver that uses binary search to find the right padding value, or, if the images are not too large, simply try successive values with padding += 1.
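For reference, the output-size relation given in the PyTorch Conv2d documentation is, per spatial axis, out = floor((in + 2*padding - dilation*(kernel_size - 1) - 1) / stride) + 1. The trial-and-error search described above might look like the following sketch. conv_out and find_padding are hypothetical helper names, not PyTorch functions, and the search only considers symmetric padding.

```python
from typing import Optional


def conv_out(size: int, kernel: int, stride: int = 1,
             dilation: int = 1, padding: int = 0) -> int:
    """Output length along one axis, following the documented relation."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def find_padding(size: int, kernel: int, stride: int,
                 dilation: int, target: int) -> Optional[int]:
    """Smallest symmetric padding that yields `target`, or None if none does."""
    padding = 0
    while True:
        out = conv_out(size, kernel, stride, dilation, padding)
        if out == target:
            return padding
        if out > target:
            # Output size never shrinks as padding grows, so we overshot.
            return None
        padding += 1


print(find_padding(30, kernel=3, stride=1, dilation=1, target=30))  # 1
print(find_padding(30, kernel=3, stride=2, dilation=1, target=15))  # 1
print(find_padding(30, kernel=3, stride=1, dilation=2, target=30))  # 2
print(find_padding(30, kernel=3, stride=1, dilation=1, target=29))  # None
```

A linear scan is enough here because the padding that works is never larger than about half the dilated kernel size when the target is the "same" output size.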
Upvotes: 1
Reputation: 31
self.features = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, bias=False),
    nn.BatchNorm2d(64),
    nn.ReLU(inplace=True),
    nn.Dropout(0.3),
    nn.Conv2d(64, 64, kernel_size=3, padding=1, bias=False),
    nn.BatchNorm2d(64),
    nn.ReLU(inplace=True),
    # MaxPool2d padding may be at most half the kernel size; padding=1 maps 30x30 to 15x15.
    nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
)
Upvotes: 3
Reputation: 201
W: input volume size
F: kernel size
S: stride
P: amount of padding
Size of output volume = (W - F + 2P) / S + 1 (rounded down when S does not divide evenly)
For example:
input: 7x7, kernel: 3x3, stride: 1, pad: 0
output size = (7 - 3 + 2*0) / 1 + 1 = 5, i.e. 5x5
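For comparison, the output sizes Keras produces for its two padding modes can be written in the same notation. This sketch assumes the usual TensorFlow/Keras rule, ceil(W / S) for "same" and floor((W - F) / S) + 1 for "valid", with no dilation; keras_out_size is a hypothetical helper name, not a Keras function.

```python
import math


def keras_out_size(w: int, f: int, s: int, padding: str) -> int:
    """Output length along one axis for Keras-style padding modes."""
    if padding == "same":
        return math.ceil(w / s)
    if padding == "valid":
        return (w - f) // s + 1
    raise ValueError(f"unknown padding mode: {padding!r}")


print(keras_out_size(7, f=3, s=1, padding="valid"))   # 5, the example above
print(keras_out_size(32, f=3, s=1, padding="valid"))  # 30
print(keras_out_size(30, f=3, s=1, padding="same"))   # 30
print(keras_out_size(30, f=2, s=2, padding="same"))   # 15
```

Matching a PyTorch layer to a Keras one then comes down to choosing P so that the formula above gives the same number.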
Upvotes: 6
Reputation: 2079
The formula is: k = (n - 1) / 2, where k is the padding and n is the kernel size (odd, with stride 1).
Upvotes: 1