Reputation: 7089
I'm trying to port some PyTorch code to TensorFlow 2.0 and am having difficulty figuring out how to translate the convolution functions between the two. The way each library deals with padding is the sticking point. Basically, I'd like to understand how I can manually produce the padding that PyTorch does under the hood so that I can translate that to TensorFlow.
The code below works if I don't do any padding, but I can't figure out how to make the two implementations match once any padding is added.
import numpy as np
import tensorflow as tf
import torch
import torch.nn.functional as F

output_padding = SOME NUMBER
padding = SOME OTHER NUMBER
strides = 128

tensor = np.random.rand(2, 258, 249)
filters = np.random.rand(258, 1, 256)

out_torch = F.conv_transpose1d(
    torch.from_numpy(tensor).float(),
    torch.from_numpy(filters).float(),
    stride=strides,
    padding=padding,
    output_padding=output_padding)

def pytorch_transpose_conv1d(inputs, filters, strides, padding, output_padding):
    N, L_in = inputs.shape[0], inputs.shape[2]
    out_channels, kernel_size = filters.shape[1], filters.shape[2]
    time_out = (L_in - 1) * strides - 2 * padding + (kernel_size - 1) + output_padding + 1
    padW = (kernel_size - 1) - padding

    # HOW DO I PAD HERE TO GET THE SAME OUTPUT AS IN PYTORCH
    inputs = tf.pad(inputs, [(?, ?), (?, ?), (?, ?)])

    return tf.nn.conv1d_transpose(
        inputs,
        tf.transpose(filters, perm=(2, 1, 0)),
        output_shape=(N, out_channels, time_out),
        strides=strides,
        padding="VALID",
        data_format="NCW")

out_tf = pytorch_transpose_conv1d(tensor, filters, strides, padding, output_padding)
assert np.allclose(out_tf.numpy(), out_torch.numpy())
Upvotes: 1
Views: 4337
Reputation: 1515
To translate the convolution and transposed convolution functions (with padding) between PyTorch and TensorFlow, we first need to understand the F.pad() and tf.pad() functions.

F.pad():
- padding size: the sizes by which to pad some dimensions of input, described starting from the last dimension and moving forward.
- To pad only the last dimension of the input tensor, pad has the form (padding_left, padding_right).
- To pad the last 3 dimensions, use (padding_left, padding_right, padding_top, padding_bottom, padding_front, padding_back).

tf.pad():
- padding_size: an integer tensor with shape [n, 2], where n is the rank of the tensor. For each dimension D of input, paddings[D, 0] indicates how many values to add before the contents of tensor in that dimension, and paddings[D, 1] indicates how many values to add after the contents of tensor in that dimension.

[Table: F.pad and tf.pad equivalents, with the resulting output tensor, for the input tensor [[[1, 1], [1, 1]]], which is of shape (1, 2, 2).]
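The difference in index order between the two conventions can be sketched in plain Python. The helper name `torch_pad_to_tf_paddings` is hypothetical and belongs to neither library; it only rearranges numbers, assuming a flat F.pad-style list made of (before, after) pairs.

```python
def torch_pad_to_tf_paddings(pad, rank):
    """Convert a flat F.pad-style list (last dimension first) into
    tf.pad-style [[before, after], ...] pairs, one per dimension.

    Hypothetical helper for illustration; not part of torch or tensorflow.
    """
    if len(pad) % 2 != 0:
        raise ValueError("pad must be made of (before, after) pairs")
    n_padded = len(pad) // 2
    if n_padded > rank:
        raise ValueError("pad describes more dimensions than the tensor has")
    # Dimensions that `pad` does not mention are left unpadded.
    paddings = [[0, 0] for _ in range(rank)]
    for i in range(n_padded):
        # Pair i of `pad` belongs to the i-th dimension counted from the end.
        paddings[rank - 1 - i] = [pad[2 * i], pad[2 * i + 1]]
    return paddings


# For an input of shape (1, 2, 2), i.e. rank 3:
print(torch_pad_to_tf_paddings([1, 2], rank=3))        # pads only the last dimension
print(torch_pad_to_tf_paddings([1, 1, 2, 2], rank=3))  # pads the last two dimensions
```

The returned list can be passed as the second argument of tf.pad, while the original flat list is what F.pad expects.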
Let's now move to PyTorch padding in Convolution layers
F.conv1d(input, ..., padding, ...):
- padding controls the amount of implicit zero-padding on both sides of the input, for padding number of points.
- padding=(size) applies F.pad(input, [size, size]), i.e. it pads the last dimension with (size, size), which is equivalent to tf.pad(input, [[0, 0], [0, 0], [size, size]]).

F.conv2d(input, ..., padding, ...):
- padding=(size) applies F.pad(input, [size, size, size, size]), i.e. it pads the last 2 dimensions with (size, size), which for a batched 4-D (NCHW) input is equivalent to tf.pad(input, [[0, 0], [0, 0], [size, size], [size, size]]).
- padding=(size1, size2) applies F.pad(input, [size2, size2, size1, size1]), which for a batched 4-D (NCHW) input is equivalent to tf.pad(input, [[0, 0], [0, 0], [size1, size1], [size2, size2]]).
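As a rough sketch of the mapping above, the `padding` argument of F.conv1d or F.conv2d can be turned into tf.pad-style paddings with a few lines of plain Python. The helper `conv_padding_to_tf_paddings` is a hypothetical name, and it assumes a batched, channels-first input (NCW for conv1d, NCHW for conv2d), so the batch and channel dimensions always get [0, 0].

```python
def conv_padding_to_tf_paddings(padding, n_spatial):
    """tf.pad-style paddings matching the `padding` argument of
    F.conv1d (n_spatial=1) or F.conv2d (n_spatial=2).

    Hypothetical helper for illustration; assumes a batched,
    channels-first input, which is an assumption of this sketch.
    """
    if isinstance(padding, int):
        # A single number is applied to every spatial dimension.
        padding = (padding,) * n_spatial
    if len(padding) != n_spatial:
        raise ValueError("need one padding value per spatial dimension")
    # Batch and channel dimensions are never padded;
    # each spatial dimension is padded symmetrically.
    return [[0, 0], [0, 0]] + [[p, p] for p in padding]


print(conv_padding_to_tf_paddings(3, n_spatial=1))       # conv1d, padding=3
print(conv_padding_to_tf_paddings((1, 2), n_spatial=2))  # conv2d, padding=(1, 2)
```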
PyTorch padding in Transpose Convolution layers
- padding controls the amount of implicit zero-padding: dilation * (kernel_size - 1) - padding points of zero-padding will be added to both sides of each dimension in the input.
- Padding in transposed convolutions can be seen as allocating fake outputs that will be removed.
- output_padding controls the additional size added to one side of the output shape.
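The arithmetic behind these parameters can be written out as a small sketch. The function names below are hypothetical; the output-length expression is the one documented for PyTorch's ConvTranspose1d, and the implicit padding expression is the one quoted above.

```python
def conv_transpose1d_out_len(l_in, kernel_size, stride=1, padding=0,
                             output_padding=0, dilation=1):
    """Output length of a 1-D transposed convolution (PyTorch convention)."""
    return ((l_in - 1) * stride - 2 * padding
            + dilation * (kernel_size - 1) + output_padding + 1)


def implicit_input_padding(kernel_size, padding=0, dilation=1):
    """Zero-padding implicitly added to each side of the input."""
    return dilation * (kernel_size - 1) - padding


# Input length 249, kernel size 7, stride 6, padding 9, output_padding 4:
print(conv_transpose1d_out_len(249, 7, stride=6, padding=9, output_padding=4))  # 1481
```

With padding=0 and output_padding=0 the same function gives the "valid" output length, which is the size a VALID transposed convolution in TensorFlow produces.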
Transpose Convolution
import torch
import torch.nn as nn
import torch.nn.functional as F
import tensorflow as tf
import numpy as np

# to stop a tf CheckFailed error; not relevant to the actual code
import os
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

def tconv(tensor, filters, output_padding=0, padding=0, strides=1):
    '''
    tensor         : input tensor of shape (batch_size, channels, W) i.e. (NCW)
    filters        : input kernel of shape (in_ch, out_ch, kernel_size)
    output_padding : single number, must be smaller than either stride or dilation
    padding        : single number, should be less than or equal to ((valid output size + output padding) // 2)
    strides        : single number
    '''
    bs, in_ch, W = tensor.shape
    in_ch, out_ch, k_sz = filters.shape

    out_torch = F.conv_transpose1d(torch.from_numpy(tensor).float(),
                                   torch.from_numpy(filters).float(),
                                   stride=strides, padding=padding,
                                   output_padding=output_padding)
    out_torch = out_torch.numpy()

    # output_size = (input_size - 1)*stride + (kernel_size - 1) + 1 + output_padding - 2*padding
    # valid out size -> padding=0, output_padding=0
    # -> valid_out_size = (input_size - 1)*stride + (kernel_size - 1) + 1
    out_size = (W - 1)*strides + (k_sz - 1) + 1

    # input shape -> (batch_size, W, in_ch) and filters shape -> (kernel_size, out_ch, in_ch) for tf conv
    valid_tf = tf.nn.conv1d_transpose(np.transpose(tensor, axes=(0, 2, 1)),
                                      np.transpose(filters, axes=(2, 1, 0)),
                                      output_shape=(bs, out_size, out_ch),
                                      strides=strides, padding='VALID',
                                      data_format='NWC')

    # output padding: append `output_padding` zeros at the end of the W axis
    tf_outpad = tf.pad(valid_tf, [[0, 0], [0, output_padding], [0, 0]])

    # NWC to NCW
    tf_outpad = np.transpose(tf_outpad, (0, 2, 1))

    # padding -> input, begin, shape -> remove `padding` elements on both sides
    out_tf = tf.slice(tf_outpad, [0, 0, padding], [bs, out_ch, tf_outpad.shape[2]-2*padding])
    out_tf = np.array(out_tf)

    print('output size(tf, torch):', out_tf.shape, out_torch.shape)
    # print('out_torch:\n', out_torch)
    # print('out_tf:\n', out_tf)
    print('outputs are close:', np.allclose(out_tf, out_torch))

tensor = np.random.rand(2, 1, 7)
filters = np.random.rand(1, 2, 3)
tconv(tensor, filters, output_padding=2, padding=5, strides=3)
Results
>>> tensor = np.random.rand(2, 258, 249)
>>> filters = np.random.rand(258, 1, 7)
>>> tconv(tensor, filters, output_padding=4, padding=9, strides=6)
output size(tf, torch): (2, 1, 1481) (2, 1, 1481)
outputs are close: True
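The pad-then-crop steps used in tconv can also be checked without either framework. The following is a minimal single-channel sketch on plain Python lists; `conv_transpose1d_ref` is a hypothetical name, and it assumes dilation 1, no bias, and one input and one output channel.

```python
def conv_transpose1d_ref(x, kernel, stride=1, padding=0, output_padding=0):
    """Single-channel 1-D transposed convolution on Python lists.

    Illustrative sketch only: dilation 1, no bias, one channel.
    """
    # "Valid" result: every input sample scatters a scaled copy of the kernel.
    full_len = (len(x) - 1) * stride + len(kernel)
    full = [0.0] * full_len
    for i, xi in enumerate(x):
        for j, kj in enumerate(kernel):
            full[i * stride + j] += xi * kj
    # output_padding: extra positions on one side (the end); no input reaches them.
    padded = full + [0.0] * output_padding
    # padding: drop `padding` outputs from both sides.
    return padded[padding:len(padded) - padding]


print(conv_transpose1d_ref([1, 2], [1, 1, 1], stride=2))  # [1.0, 1.0, 3.0, 2.0, 2.0]
print(conv_transpose1d_ref([1, 2], [1, 1, 1], stride=2,
                           padding=1, output_padding=1))   # [1.0, 3.0, 2.0, 2.0]
```

The second call has length (2 - 1)*2 - 2*1 + (3 - 1) + 1 + 1 = 4, which agrees with the output size formula in the comments of tconv.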
Upvotes: 3