
Reputation: 1

Train LSTM for time series with varying lengths

I'm training an LSTM for time series prediction, where the data comes from sensors at irregular intervals. I'm using the last 5 minutes of data to predict the next value, but some sequences are longer than others.

My input array's shape is (611, 1200, 15), i.e. (samples, timesteps, features). The second dimension is not fully populated for every sample, so I padded the missing timesteps with np.nan values. For instance, sample (1,:,:) has 1000 real timesteps followed by 200 timesteps of np.nan.

While training, loss equals nan.

What am I doing wrong? How can I train it?

Here's my attempt to train the LSTM:

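# Assumed imports (tf.keras)
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Masking, LSTM, Dense
from tensorflow.keras.callbacks import EarlyStopping

def lstmFit(y, X, n_hidden=1, n_neurons=30, learning_rate=1e-2):
    lstm = Sequential()
    lstm.add(Masking(mask_value=np.nan, input_shape=(None, X.shape[2])))
    for layer in range(n_hidden):
        lstm.add(LSTM(n_neurons,
                      recurrent_activation="sigmoid",
                      return_sequences=True))  # matches the (None, None, 30) output in the summary
    lstm.add(Dense(1))
    lstm.compile(loss="mse", optimizer="adam")
    early_stopping = EarlyStopping(monitor='loss', patience=10, verbose=1, restore_best_weights=True)
    lstm.fit(X, y.reshape(-1), epochs=100, callbacks=[early_stopping])
    y_train_fit = lstm.predict(X)
    return lstm, y_train_fit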

The model's summary:

Model: "sequential_9"
 Layer (type)                Output Shape              Param #   
 masking_7 (Masking)         (None, None, 15)          0         
 lstm_6 (LSTM)               (None, None, 30)          5520      
 dense_10 (Dense)            (None, None, 1)           31        
Total params: 5551 (21.68 KB)
Trainable params: 5551 (21.68 KB)
Non-trainable params: 0 (0.00 Byte)

And the first epochs of training:

Epoch 1/100
18/18 [==============================] - 20s 335ms/step - loss: nan
Epoch 2/100
18/18 [==============================] - 6s 335ms/step - loss: nan
Epoch 3/100
18/18 [==============================] - 7s 365ms/step - loss: nan

Upvotes: 0

Views: 20

Answers (1)

Mete Han Kahraman

Reputation: 760

Assuming the NaNs are at the end of the sequences, you can try:

  • Replacing all NaN values in the input with 0, or with another value if that makes more sense for your data.

  • Cutting every sequence down to the length of the shortest sequence.

  • Repeating the first or last data point of each sequence so that all sequences have the same length (the maximum sequence length).
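
The three options above can be sketched in NumPy roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes the NaNs appear only as trailing padding and that a timestep counts as padding exactly when all of its features are NaN. The helper names (`sequence_lengths`, `fill_with_constant`, `truncate_to_shortest`, `repeat_last_step`) are hypothetical and not part of any library, and the tiny array stands in for the real (611, 1200, 15) input.

```python
import numpy as np

def sequence_lengths(X):
    """Number of real (non-padding) timesteps per sample.

    Assumes padding is trailing and a padded timestep is NaN in every feature.
    """
    is_real = ~np.isnan(X).all(axis=2)   # shape: (samples, timesteps)
    return is_real.sum(axis=1)

def fill_with_constant(X, value=0.0):
    """Option 1: replace every NaN with a constant."""
    return np.where(np.isnan(X), value, X)

def truncate_to_shortest(X):
    """Option 2: keep only the first min(length) timesteps of every sample."""
    return X[:, :sequence_lengths(X).min(), :]

def repeat_last_step(X):
    """Option 3: fill the padding with each sample's last real timestep."""
    out = X.copy()
    for i, n in enumerate(sequence_lengths(X)):
        if 0 < n < X.shape[1]:
            out[i, n:, :] = X[i, n - 1, :]
    return out

# Stand-in data: 2 samples, 4 timesteps, 2 features.
X = np.arange(16, dtype=float).reshape(2, 4, 2)
X[0, 3:, :] = np.nan   # sample 0 has 3 real timesteps
X[1, 2:, :] = np.nan   # sample 1 has 2 real timesteps

X_zero = fill_with_constant(X)
X_short = truncate_to_shortest(X)
X_edge = repeat_last_step(X)
```

Note that with trailing padding, truncation as written drops the most recent readings of the longer sequences, which may matter when the goal is to predict the next value.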

Pick whichever makes more sense in your case. If you don't know which is better, try them all and compare the results.
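
One detail that may explain the nan loss in the question: NaN never compares equal to anything, including itself. The Keras Masking layer is documented as masking a timestep when all of its values equal mask_value, so with mask_value=np.nan no timestep can match, and the NaNs then flow through the LSTM into the loss. The NumPy sketch below only mimics that equality check to illustrate the point (it is not Keras code, and `timestep_mask` is a hypothetical helper). It assumes no real timestep consists entirely of the chosen sentinel value.

```python
import numpy as np

def timestep_mask(X, mask_value):
    """True where a timestep is kept, i.e. not every feature equals mask_value."""
    return np.any(X != mask_value, axis=-1)

X = np.ones((1, 4, 2))
X[0, 2:, :] = np.nan              # two trailing padded timesteps

# NaN != NaN evaluates to True, so every timestep is kept and nothing is masked.
mask_nan = timestep_mask(X, np.nan)

# Replace NaN with a finite sentinel first, then mask on that sentinel.
X_filled = np.nan_to_num(X, nan=0.0)
mask_zero = timestep_mask(X_filled, 0.0)

print(mask_nan)    # every entry is True
print(mask_zero)   # True for the 2 real steps, False for the 2 padded ones
```

With the array filled this way, using Masking(mask_value=0.0, ...) in the question's model should let the padded timesteps be skipped instead of being fed to the LSTM.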

Upvotes: 0
