Reputation: 441
I want to combine a CNN with LSTMs in PyTorch because my data is a time series, i.e. frames of video for heart rate detection. I am struggling with the input and output dimensions of the LSTM: how should I properly configure the dimensions/parameters/arguments at the LSTM input in PyTorch? It is quite confusing when considering time steps, hidden state, etc. The output from my CNN is "2 batches of 256 frames", which is now the input to the LSTM: the batch size is 2 and features = 256. The output should also be a batch of 2 with 256 frames.
Upvotes: 1
Views: 1458
Reputation: 1741
Generally, the input shape of sequential data takes the form (batch_size, seq_len, num_features). Based on your explanation, I assume your input is of the form (2, 256), where 2 is the batch size and 256 is the sequence length of scalars (1-dimensional features). Therefore, you should reshape your input to (2, 256, 1) with inputs.unsqueeze(2).
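As a minimal sketch of that reshape, assuming the CNN output really is a (2, 256) tensor of scalars (the random tensor `cnn_out` below is only a stand-in for it):

```python
import torch

# Stand-in for the CNN output: 2 sequences of 256 scalar values.
# Random data, used here only to illustrate the shapes.
cnn_out = torch.randn(2, 256)

# Add a trailing feature dimension:
# (batch_size, seq_len) -> (batch_size, seq_len, num_features)
inputs = cnn_out.unsqueeze(2)

print(cnn_out.shape)  # torch.Size([2, 256])
print(inputs.shape)   # torch.Size([2, 256, 1])
```

If the CNN instead produces a feature vector per frame, the last dimension would be that feature size rather than 1, and no unsqueeze would be needed.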
To declare and use an LSTM model, try:
import torch
from torch import nn

some_hidden_size = 64  # example value; maybe 64, 128, etc.
some_num_layers = 1    # example value; maybe 1 or 2

model = nn.LSTM(
    input_size=1,                  # 1-dimensional features
    batch_first=True,              # batch is the first (zero-th) dimension
    hidden_size=some_hidden_size,
    num_layers=some_num_layers,
    proj_size=1,                   # output should also be 1-dimensional
)

inputs = torch.randn(2, 256).unsqueeze(2)  # placeholder for your CNN output, shape (2, 256, 1)
outputs, (hidden_state, cell_state) = model(inputs)
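To see how time steps and hidden state map onto the returned tensors, here is a rough shape check. The hidden size of 64, the single layer, and the random input are assumptions chosen for illustration, not values from the question:

```python
import torch
from torch import nn

# Assumed example configuration: hidden_size=64, one layer, 1-d projection.
lstm = nn.LSTM(input_size=1, hidden_size=64, num_layers=1,
               batch_first=True, proj_size=1)

inputs = torch.randn(2, 256, 1)  # (batch_size, seq_len, num_features)
outputs, (hidden_state, cell_state) = lstm(inputs)

# One output vector per time step, projected down to 1 feature.
print(outputs.shape)       # torch.Size([2, 256, 1])
# Final hidden state: (num_layers, batch_size, proj_size).
print(hidden_state.shape)  # torch.Size([1, 2, 1])
# Final cell state keeps the full hidden size: (num_layers, batch_size, hidden_size).
print(cell_state.shape)    # torch.Size([1, 2, 64])

# Drop the trailing feature dimension to get back "a batch of 2 with 256 frames".
per_frame = outputs.squeeze(2)
print(per_frame.shape)     # torch.Size([2, 256])
```

Note that batch_first=True only affects the input and outputs tensors; hidden_state and cell_state always have the layer dimension first.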
Upvotes: 1