Zenon Lambrou

Reputation: 11

LSTM layer followed by BatchNorm Layer error in dimensions

I am trying to train a model with the following architecture:

self.lstm1 = nn.LSTM(in_channels, hidden_channels, num_layers, batch_first=True, dropout=dropout_prob)
self.batchnorm1 = nn.BatchNorm1d(hidden_channels)

self.lstm2 = nn.LSTM(hidden_channels, hidden_channels, num_layers, batch_first=True, dropout=dropout_prob)
self.batchnorm2 = nn.BatchNorm1d(hidden_channels)

self.lstm3 = nn.LSTM(hidden_channels, hidden_channels, num_layers, batch_first=True, dropout=dropout_prob)
self.batchnorm3 = nn.BatchNorm1d(hidden_channels)

self.fc1 = nn.Linear(hidden_channels, out_channels)
in_channels = 105
hidden_channels = 128
batch_size = 32

Dimension of the input data: (32, 770, 105)
Dimension of the output from lstm1: (32, 770, 128)

When training the network, as soon as execution reaches the batchnorm1 layer I get this error:

RuntimeError: running_mean should contain 770 elements not 128

Can you let me know where the mistake is?

I tried to use permute to change the output dimension from (32, 770, 128) to (32, 128, 770), but I still got a different error.

Upvotes: 0

Views: 37

Answers (1)

Matmozaur

Reputation: 539

The problem is not the order in which you define the layers: nn.BatchNorm1d expects its input as (batch, channels, length) and normalizes over dimension 1. Since your LSTMs use batch_first=True, their output is (batch, seq_len, hidden) = (32, 770, 128), so the batch norm sees 770 channels, not the 128 you configured. If you want to apply BatchNorm1d directly to the LSTM output in this layout, set num_features to the sequence length for every batch norm:

seq_len = 770
self.lstm1 = nn.LSTM(in_channels, hidden_channels, num_layers, batch_first=True, dropout=dropout_prob)
self.batchnorm1 = nn.BatchNorm1d(seq_len)

self.lstm2 = nn.LSTM(hidden_channels, hidden_channels, num_layers, batch_first=True, dropout=dropout_prob)
self.batchnorm2 = nn.BatchNorm1d(seq_len)

self.lstm3 = nn.LSTM(hidden_channels, hidden_channels, num_layers, batch_first=True, dropout=dropout_prob)
self.batchnorm3 = nn.BatchNorm1d(seq_len)

self.fc1 = nn.Linear(hidden_channels, out_channels)

Note that this normalizes across time steps rather than across the hidden features. If you want to normalize the 128 hidden features instead (the more common setup), keep num_features = hidden_channels and permute the tensor to (batch, hidden, seq_len) before each batch norm, then permute it back afterwards.
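For the permute approach, here is a minimal runnable sketch. The `LSTMNet` wrapper and `out_channels=10` are made up for illustration (the question never shows its forward pass or output size); only the first LSTM/batch-norm pair is included to keep it short:

```python
import torch
import torch.nn as nn

class LSTMNet(nn.Module):
    """Hypothetical wrapper around the question's first LSTM + batch norm."""
    def __init__(self, in_channels=105, hidden_channels=128,
                 out_channels=10, num_layers=1, dropout_prob=0.0):
        super().__init__()
        self.lstm1 = nn.LSTM(in_channels, hidden_channels, num_layers,
                             batch_first=True, dropout=dropout_prob)
        # num_features is the hidden size, because we permute before normalizing
        self.batchnorm1 = nn.BatchNorm1d(hidden_channels)
        self.fc1 = nn.Linear(hidden_channels, out_channels)

    def forward(self, x):              # x: (batch, seq_len, in_channels)
        out, _ = self.lstm1(x)         # (batch, seq_len, hidden)
        out = out.permute(0, 2, 1)     # (batch, hidden, seq_len) for BatchNorm1d
        out = self.batchnorm1(out)
        out = out.permute(0, 2, 1)     # back to (batch, seq_len, hidden)
        return self.fc1(out)

model = LSTMNet()
y = model(torch.randn(32, 770, 105))
print(y.shape)  # torch.Size([32, 770, 10])
```

Forgetting the second permute is a likely cause of the "different error" you saw: the next LSTM would then receive (batch, hidden, seq_len) and complain about its input size.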

Upvotes: 0
