wwwwwwwwww

Reputation: 133

Keras LSTM: how to use a model trained many-to-many for one-to-many prediction with stateful=True?

I am learning how to use the LSTM model in Keras. I have looked at this answer and this answer, and would like to train a model in a many-to-many manner but, at test time, make predictions in a one-to-many manner with stateful=True. I am unsure if I am on the right track.

I have a data set comprising 10,000 individuals, each with a sequence of 20 timesteps and 10 features. I want to train an LSTM model to predict 5 of the features in the next timestep. Using a 90-10 train/test split, my train_x is shaped (9,000, 20, 10) and my train_y is shaped (9,000, 20, 5), where the values in y are the values of the selected features in the next timestep. My test_x is shaped (1,000, 20, 10).
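To make the shapes concrete, here is a minimal sketch of how arrays like these could be built, using random stand-in data rather than the real data set. It rests on several assumptions that are not stated above: each raw sequence is taken to have 21 steps so that all 20 inputs have a "next" step, the 5 target features are taken to be the last 5 columns, and only 100 individuals are used instead of 10,000 to keep it small.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical raw data: 21 steps per individual, so each of the
# 20 input steps has a following step to use as its target.
n_individuals, n_steps, n_features = 100, 21, 10
raw = rng.normal(size=(n_individuals, n_steps, n_features))

# Inputs are steps 0..19 with all 10 features.
x = raw[:, :-1, :]      # shape (100, 20, 10)

# Targets are steps 1..20, restricted to the 5 selected features
# (assumed here to be the last 5 columns).
y = raw[:, 1:, -5:]     # shape (100, 20, 5)

# 90-10 split by individual, without shuffling.
n_train = int(0.9 * n_individuals)
train_x, test_x = x[:n_train], x[n_train:]
train_y, test_y = y[:n_train], y[n_train:]
```

With this layout, train_y at step t holds the selected features of train_x at step t+1, which is the "next timestep" relationship described above.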

At test time, I would like to use the trained model to make predictions using only the 10 features at the very start of the sequence (timestep 0). First, I predict the values of the selected 5 features in the next timestep. The values of the other 5 features in the next timestep are known, so I would like to combine them with the 5 predicted features, use the result as input to predict the 5 features in the following timestep, and so on for 20 steps.

Is it possible to do this using the Keras library?

My code for training looks like

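from keras.models import Sequential
from keras.layers import LSTM, Dense, TimeDistributed
from keras.callbacks import ModelCheckpoint

# Many-to-many: one 5-feature prediction per input timestep
t_model = Sequential()
t_model.add(LSTM(100, return_sequences=True,
               input_shape=(train_x.shape[1],
                            train_x.shape[2])))
t_model.add(TimeDistributed(Dense(5)))
t_model.compile(loss='mean_squared_error',
              optimizer='adam')
# Save to weights.hdf5 whenever the validation loss improves
checkpointer = ModelCheckpoint(filepath='weights.hdf5',
                               verbose=1,
                               save_best_only=True)
history = t_model.fit(train_x, train_y, epochs=50,
          validation_split=0.1, callbacks=[checkpointer],
          verbose=2, shuffle=False)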

This seems to train ok. Please let me know if there is any misunderstanding in the way I am structuring my model.

My code for testing looks like

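import numpy as np

# Same layers as the training model, but stateful and fed
# one timestep of one individual at a time
p_model = Sequential()
p_model.add(LSTM(100, stateful=True,
                 return_sequences=True,
                 batch_input_shape=(1, 1,
                                    test_x.shape[2])))
p_model.add(TimeDistributed(Dense(5)))
p_model.load_weights('weights.hdf5')
complete_yhat = np.empty([0, 5])
for i in range(len(test_x)):
    ind = test_x[i]
    # Start from the true features at timestep 0
    x = ind[0]
    x = x.reshape(1, 1, x.shape[0])
    for j in range(20):
        yhat = p_model.predict(x)
        complete_yhat = np.append(complete_yhat, yhat[0], axis=0)
        if j < 19:
            # Next input: the 5 known features (first 5 columns)
            # followed by the 5 predicted features
            x = ind[j+1]
            x = np.append([x[:-5]], yhat[0], axis=1)
            x = x.reshape(1, x.shape[0], x.shape[1])
    # Clear the LSTM state before the next individual
    p_model.reset_states()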

This runs OK, but I am struggling to get good forecast accuracy. Can someone let me know whether I am using the Keras LSTM correctly?

Thank you for your help

Upvotes: 0

Views: 529

Answers (1)

SaTa

Reputation: 2692

I am not sure you can really train a model with a many-to-many architecture and then test it one-to-many. You might be able to hack something together and get a piece of code that runs, but from a technical point of view this does not make much sense. Can you explain why you want to do one-to-many at test time?

Generally, the rule of thumb for any supervised machine learning model development is that your training phase should "resemble" your testing phase. For example, if you want to test with a one-to-many architecture, then you should also train it as one-to-many.

Edit:

Reading the comments, it seems that you want to train with features from one time step and see how the model performs for future time steps. (I think this is at odds with the nature of time-series data, where every sample contributes to the future state; if one sample can predict the future very well, then the following samples are useless... but anyway.) Here is how you can do this. There are other ways, of course...

Split your data for training and testing the same way you handle it at test time, so your input should be of shape (None, 10) and your output of shape (None, 20, 5). Then use Keras RepeatVector on your input (like this: output = RepeatVector(20)(input)) and you should get something of shape (None, 20, 10), which you can now pass through the rest of your model.
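The data side of this can be sketched in plain NumPy. This is only an illustration of the shapes, not a full model: the arrays are random stand-ins with the shapes from the question (90 individuals instead of 9,000), and np.repeat is used here to mimic what RepeatVector(20) does to a (None, 10) input inside the model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in arrays with the shapes from the question (fewer individuals).
train_x = rng.normal(size=(90, 20, 10))
train_y = rng.normal(size=(90, 20, 5))

# One-to-many input: keep only time step 0 of every individual.
first_step = train_x[:, 0, :]                      # shape (90, 10)

# Mimic RepeatVector(20): copy each 10-feature vector along a
# new time axis so the recurrent layer sees 20 identical steps.
repeated = np.repeat(first_step[:, np.newaxis, :], 20, axis=1)

print(first_step.shape, repeated.shape, train_y.shape)
```

In the actual model you would fit on first_step as the input and train_y as the target, and let the RepeatVector layer do the tiling, so that training and testing both see only the first time step.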

Upvotes: 0
