user17515752

Reputation: 75

PyTorch: LSTM predicts the same constant value

I want to predict one variable from 7 features, using sequences of 4 time steps:

# Shape X_train: torch.Size([24433, 4, 7])
# Shape Y_train: torch.Size([24433, 4, 1])

# Shape X_test: torch.Size([6109, 4, 7])
# Shape Y_test: torch.Size([6109, 4, 1])

train_dataset = TensorDataset(X_train, Y_train)
test_dataset = TensorDataset(X_test, Y_test) 

train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=32, shuffle=False)

My (initial) LSTM model:

class LSTMModel(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size)
        self.linear = nn.Linear(hidden_size, output_size)
        
    def forward(self, x):
        x, _ = self.lstm(x)
        x = self.linear(x)
        return x

model = LSTMModel(input_size=7, hidden_size=256, output_size=1)

loss_fn = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
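One thing worth checking in a setup like this: nn.LSTM expects input shaped (seq_len, batch, input_size) unless batch_first=True is passed, while the tensors above appear to be laid out as (batch, seq_len, features). The following is a minimal sketch of a batch-first variant, not the fix reported in the answer. The class name BatchFirstLSTM, the smaller hidden size, and the demo tensors are hypothetical and used only for illustration.

```python
import torch
import torch.nn as nn


class BatchFirstLSTM(nn.Module):
    """Hypothetical variant of the model above that reads (batch, seq, feature) input."""

    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        # batch_first=True tells nn.LSTM that dim 0 is the batch and dim 1 is time
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.linear = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        out, _ = self.lstm(x)    # out: (batch, seq_len, hidden_size)
        return self.linear(out)  # (batch, seq_len, output_size)


torch.manual_seed(0)
demo_model = BatchFirstLSTM(input_size=7, hidden_size=16, output_size=1)
demo_batch = torch.randn(32, 4, 7)  # same layout as one batch from train_loader
demo_out = demo_model(demo_batch)
print(demo_out.shape)
```

With this layout, each of the 32 samples is processed as its own 4-step sequence, so the prediction for one sample does not depend on the other samples in the batch.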

Apply model:

# Loop over the training set
for X, Y in train_loader:

    optimizer.zero_grad()
    
    Y_pred = model(X)

    loss = loss_fn(Y_pred, Y)
    
    loss.backward()
    
    optimizer.step()

model.eval()

# Loop over the test set; gradients are not needed for evaluation
with torch.no_grad():
    for X, Y in test_loader:

        Y_pred = model(X)

        loss = loss_fn(Y_pred, Y)

An example of Y (true data):

tensor([[[59.],
         [59.],
         [59.],
         [59.]],

        [[70.],
         [70.],
         [70.],
         [70.]],

        [[ 100.],
         [ 0.],
         [ 0.],
         [ 0.]],

# etc.

However, my Y_pred looks something like this:

 tensor([[[15.8224],
         [15.8224],
         [15.8224],
         [15.8224]],

        [[16.1654],
         [16.1654],
         [16.1654],
         [16.1654]],

        [[16.2127],
         [16.2127],
         [16.2127],
         [16.2127]],

# etc.

I have tried numerous different things:

Examples of my data in a previous question.

I am fairly new to PyTorch and LSTMs, so I may be doing something wrong, but whatever I change, I keep getting a (near) constant value from the predictions. What am I doing wrong, and what should I be doing instead?

Upvotes: 3

Views: 881

Answers (1)

user17515752

Reputation: 75

I solved this by normalizing my input data. I now obtain different predictions for every output:

# Calculate the mean and standard deviation over the samples of the training set
# (dim=0 gives one value per time step and feature, shape [4, 7])
X_mean = X_train.mean(dim=0)
X_std = X_train.std(dim=0)

# Standardize the training set
X_train = (X_train - X_mean) / X_std

# Standardize the test set using the mean and standard deviation of the training set
X_test = (X_test - X_mean) / X_std
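If a feature is constant in the training set, its standard deviation is zero and the division above produces NaN or inf. The following is a minimal sketch of one way to guard against that, assuming statistics are pooled over both the sample and time dimensions so that each of the 7 features gets a single mean and standard deviation. The helper names fit_standardizer and apply_standardizer, the eps value, and the demo tensors are hypothetical.

```python
import torch


def fit_standardizer(x, eps=1e-8):
    """Return per-feature mean and std for x shaped (samples, seq_len, features)."""
    mean = x.mean(dim=(0, 1), keepdim=True)  # shape (1, 1, features)
    std = x.std(dim=(0, 1), keepdim=True)
    # Clamp so that constant features do not cause a division by zero
    return mean, std.clamp_min(eps)


def apply_standardizer(x, mean, std):
    """Scale x with statistics that were fitted on the training set."""
    return (x - mean) / std


torch.manual_seed(0)
demo_train = torch.randn(100, 4, 7) * 50 + 20
demo_test = torch.randn(20, 4, 7) * 50 + 20
demo_train[..., 6] = 3.0  # a constant feature, whose std is exactly 0
demo_test[..., 6] = 3.0

mean, std = fit_standardizer(demo_train)
train_scaled = apply_standardizer(demo_train, mean, std)
test_scaled = apply_standardizer(demo_test, mean, std)
```

As in the answer, the test set is scaled with the training statistics only, so no information from the test data leaks into preprocessing.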

Upvotes: 1
