Reputation: 174
I am trying to do sequence classification by passing data first through an RNN and then through a Linear layer. Normally I would just reshape the RNN output from [batch_size, sequence_size, hidden_size] to [batch_size, sequence_size*hidden_size] before passing it to Linear, but in this case I have sequences of varying lengths, so the RNN output might be, for example, [batch_size, 32, hidden_size] or [batch_size, 29, hidden_size]. So I don't know what shape to initialize the Linear layer with (in place of the question marks in the code below). Is this possible at all?
class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes=4):
        super().__init__()
        self.rnn = nn.RNN(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size*????, num_classes)

    def forward(self, x):
        # x = [batch_size, sequence_length, input_size]
        out, h_n = self.rnn(x)                        # out = [batch_size, sequence_length, hidden_size]
        out = torch.reshape(out, (out.size(0), -1))   # out = [batch_size, sequence_length*hidden_size]
        out = self.fc(out)                            # out = [batch_size, num_classes]
        return out
Currently each batch is padded to the longest sequence in the batch. Would it be better to just pad all the sequences to the same length to get rid of this problem? Does changing the shape of the input to Linear cause any bad side effects?
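For clarity, this is roughly what I mean by padding everything to one fixed length (a minimal sketch; MAX_LEN and the random tensors are just illustrative, not from my actual code):

import torch
from torch.nn.utils.rnn import pad_sequence

MAX_LEN = 32  # illustrative global maximum sequence length
seqs = [torch.randn(29, 10), torch.randn(32, 10)]  # each is [seq_len, input_size]

padded = pad_sequence(seqs, batch_first=True)  # [batch, longest_in_batch, input_size]
if padded.size(1) < MAX_LEN:
    # pad further with zeros so every batch has the same sequence length
    extra = torch.zeros(padded.size(0), MAX_LEN - padded.size(1), padded.size(2))
    padded = torch.cat([padded, extra], dim=1)  # [batch, MAX_LEN, input_size]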
Upvotes: 0
Views: 880
Reputation: 1455
Linear layers are meant to take a fixed number of features as input. If you really want to pass a variable-sized input, you could try some kind of imputation (e.g., create a Linear layer sized for the maximum sequence length and, whenever the input is shorter, pad it with the mean value of the features). But I don't think that is desirable in this scenario (or in most scenarios, for that matter). A rough sketch of that idea is below.
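Minimal sketch of the imputation idea (the function name and max_len are illustrative, not from your code): flatten the RNN output, then right-pad the feature dimension with its mean up to hidden_size * max_len before the Linear layer.

import torch

def pad_to_fixed(out, max_len):
    # out: [batch_size, seq_len, hidden_size]
    batch_size, seq_len, hidden_size = out.shape
    flat = out.reshape(batch_size, seq_len * hidden_size)
    missing = (max_len - seq_len) * hidden_size
    if missing > 0:
        # fill the missing positions with the per-sample mean feature value
        fill = flat.mean(dim=1, keepdim=True).expand(-1, missing)
        flat = torch.cat([flat, fill], dim=1)
    return flat  # [batch_size, max_len * hidden_size]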
Since you want to do sequence classification, why not just pass a summary of the input produced by the RNN (its output at the last time step) instead of the whole output, i.e. pass [batch_size, hidden_size] (the last output of the RNN) to the Linear layer? Then your Linear layer will be hidden_size x num_classes.
This is how the code would look:
class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes=4):
        super().__init__()
        self.rnn = nn.RNN(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        # x = [batch_size, sequence_length, input_size]
        out, h_n = self.rnn(x)   # out = [batch_size, sequence_length, hidden_size]
        out = out[:, -1, :]      # out = [batch_size, hidden_size]
        out = self.fc(out)       # out = [batch_size, num_classes]
        return out
This works because the RNN can learn to summarise the whole input into its last time step. Also, you can try an LSTM instead of a plain RNN, as it may achieve this summarisation even better (it handles long-term dependencies more easily); a sketch of that change is below.
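Minimal sketch of the LSTM variant (the class name is illustrative); the only real change is the recurrent layer, and note that nn.LSTM returns a tuple of hidden and cell states:

import torch.nn as nn

class LSTMClassifier(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes=4):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        # x = [batch_size, sequence_length, input_size]
        out, (h_n, c_n) = self.lstm(x)  # out = [batch_size, sequence_length, hidden_size]
        out = out[:, -1, :]             # out = [batch_size, hidden_size]
        return self.fc(out)             # [batch_size, num_classes]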
Upvotes: 1