I have some Keras code that I need to convert to PyTorch. I've done some research, but so far I am not able to reproduce the results I got from Keras. I have spent many hours on this; any tips or help would be much appreciated.
Here is the Keras code I am dealing with. The input shape is (None, 105, 768), where None is the batch size, and I want to apply Conv1D to the input. The desired output in Keras is (None, 105).
x = tf.keras.layers.Dropout(0.2)(input)
x = tf.keras.layers.Conv1D(1,1)(x)
x = tf.keras.layers.Flatten()(x)
x = tf.keras.layers.Activation('softmax')(x)
Here is what I've tried, but it performs worse in terms of results:
self.conv1d = nn.Conv1d(768, 1, 1)
self.dropout = nn.Dropout(0.2)
self.softmax = nn.Softmax()

def forward(self, input):
    x = self.dropout(input)
    x = x.view(x.shape[0], x.shape[2], x.shape[1])
    x = self.conv1d(x)
    x = torch.squeeze(x, 1)
    x = self.softmax(x)
The culprit is your attempt to swap the dimensions of the input around, since Keras and PyTorch have different conventions for the dimension order.
x = x.view(x.shape[0], x.shape[2], x.shape[1])
.view() does not swap the dimensions; it changes which part of the data belongs to a given dimension. You can think of the tensor as a flat 1D array, with .view() deciding how many consecutive elements make up each dimension. An example makes it much simpler to understand.
import torch

# Let's start with a 1D tensor.
# This is how the underlying data looks in memory.
x = torch.arange(6)
# => tensor([0, 1, 2, 3, 4, 5])
# How the tensor looks when using Keras' convention (expected input)
keras_version = x.view(2, 3)
# => tensor([[0, 1, 2],
# [3, 4, 5]])
# Vertical isn't swapped with horizontal, but the data is arranged differently
# The numbers are still incrementing from left to right
incorrect_pytorch_version = keras_version.view(3, 2)
# => tensor([[0, 1],
# [2, 3],
# [4, 5]])
To swap the dimensions you need to use torch.transpose.
correct_pytorch_version = keras_version.transpose(0, 1)
# => tensor([[0, 3],
# [1, 4],
# [2, 5]])
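Applied to the shapes from the question (the batch size of 2 below is just for illustration), a quick check shows that .view() and .transpose() agree on the shape but not on the data:
import torch

x = torch.randn(2, 105, 768)          # (batch, steps, channels), as in the question
viewed = x.view(2, 768, 105)          # same memory order, merely regrouped
swapped = x.transpose(1, 2)           # dimensions actually swapped

print(viewed.shape == swapped.shape)  # True: the shapes agree...
print(torch.equal(viewed, swapped))   # False: ...but the values are arranged differently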
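For completeness, here is a minimal sketch of the corrected module for the shapes in the question. The class name and the explicit dim=1 in Softmax are my additions; the fix itself is only replacing .view() with .transpose():
import torch
import torch.nn as nn

class Head(nn.Module):
    def __init__(self):
        super().__init__()
        self.dropout = nn.Dropout(0.2)
        self.conv1d = nn.Conv1d(768, 1, 1)  # Keras Conv1D(1, 1): 768 channels -> 1
        self.softmax = nn.Softmax(dim=1)    # softmax over the 105 positions

    def forward(self, x):                   # x: (batch, 105, 768)
        x = self.dropout(x)
        x = x.transpose(1, 2)               # (batch, 768, 105), channels first for Conv1d
        x = self.conv1d(x)                  # (batch, 1, 105)
        x = torch.squeeze(x, 1)             # (batch, 105), the Keras Flatten step
        return self.softmax(x)

model = Head()
print(model(torch.randn(4, 105, 768)).shape)  # torch.Size([4, 105])
Conv1d accepts the non-contiguous view returned by .transpose(); if a later operation complains about memory layout, a .contiguous() can be inserted after the transpose.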