Reputation:
I have a 3D tensor of names that comes out of an LSTM that's of shape (batch size x name length x embedding size)
I've been reshaping it to 2D to put it through a linear layer, because a linear layer requires (batch size, linear dimension size), using the following
y0 = output.contiguous().view(-1, output.size(-1))
this converts output to (batch size * name length, embedding size)
then I put y0 through a linear layer and reshape the result back to 3D using
y = y0.contiguous().view(output.size(0), -1, y0.size(-1))
But I'm not really sure whether the fibers of y correspond correctly to the cells of output, and I worry this is messing up my learning: with a batch size of 1 the model actually generates proper names, while any larger batch size generates nonsense.
So what I mean exactly is, with
output = (batch size, name length, embed size)
y = (batch size, name length, number of possible characters)
I need to make sure y[i,j,:] is the linearly transformed version of output[i,j,:]
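Here is a small self-contained check of what I mean, with made-up sizes (B, T, E, V are just placeholders for batch size, name length, embed size, and number of possible characters):
import torch
import torch.nn as nn

B, T, E, V = 4, 10, 32, 27
output = torch.randn(B, T, E)   # stand-in for the LSTM output
linear = nn.Linear(E, V)        # hypothetical linear layer

y0 = output.contiguous().view(-1, output.size(-1))          # (B*T, E)
y0 = linear(y0)                                             # (B*T, V)
y = y0.contiguous().view(output.size(0), -1, y0.size(-1))   # (B, T, V)

# y[i, j, :] should equal the linear layer applied to output[i, j, :]
i, j = 1, 3
print(torch.allclose(y[i, j], linear(output[i, j])))        # True (up to float tolerance)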
Upvotes: 0
Views: 400
Reputation: 36
It seems like you are using an older code example. Just 'comment out' the lines of code where you reshape the tensor as there is no need for them.
This link gives you a bit more explanation: https://discuss.pytorch.org/t/when-and-why-do-we-use-contiguous/47588
Try something like this instead and feed the output from the LSTM directly into the linear layer (nn.Linear is applied to the last dimension of its input, so it works on the 3D output as well):
output, hidden = self.lstm(x, hidden)   # (batch, name length, hidden size)
output = self.LinearLayer1(output)      # (batch, name length, number of possible characters)
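For example, a minimal sketch with made-up sizes (the NameModel class, vocabulary size 27, embed size 16, and hidden size 32 are just placeholders for illustration):
import torch
import torch.nn as nn

class NameModel(nn.Module):
    def __init__(self, vocab_size=27, embed_size=16, hidden_size=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_size)
        self.lstm = nn.LSTM(embed_size, hidden_size, batch_first=True)
        self.LinearLayer1 = nn.Linear(hidden_size, vocab_size)

    def forward(self, x, hidden=None):
        x = self.embed(x)                      # (batch, name length, embed size)
        output, hidden = self.lstm(x, hidden)  # (batch, name length, hidden size)
        output = self.LinearLayer1(output)     # linear acts on the last dim only
        return output, hidden                  # (batch, name length, vocab size)

model = NameModel()
logits, _ = model(torch.randint(0, 27, (4, 10)))  # batch of 4 names, length 10
print(logits.shape)                               # torch.Size([4, 10, 27])
This keeps y[i,j,:] aligned with output[i,j,:] automatically, since no reshaping happens at all.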
Upvotes: 1