CarterB

Reputation: 542

Reshaping data to fit a multivariate LSTM time series model with time distributed wrapper

I have a dataset comprising hourly data from the past 7 years.

I am trying to train a model to predict one variable (price) 24 hours in advance, based on 168 hours (1 week) of historic data for all variables, including price.

To do this I am attempting to build a NN with an LSTM layer and a time distributed layer, but I am struggling to understand the shape of the data returned by each layer.

My code is as follows:

X_train.shape, Y_train.shape, X_valid.shape, Y_valid.shape, X_test.shape, Y_test.shape

((43800, 168, 6),
 (43800, 24),
 (8760, 168, 6),
 (8760, 24),
 (8574, 168, 6),
 (8574, 24))

So the training data (X) is made up of 43800 samples, each looking back over 168 hours, with 6 features. The Y is 43800 samples, each a prediction for every hour of the next 24 hours. This is where I currently am in attempting to run the model:

model8 = keras.models.Sequential([
    keras.layers.LSTM(10, input_shape=(X_train.shape[1], X_train.shape[2]), return_sequences = True),
    keras.layers.LSTM(20, return_sequences= True),
    keras.layers.TimeDistributed(keras.layers.Dense(24))
])

model8.compile(loss="mape", optimizer="adam")
history = model8.fit(X_train, Y_train, epochs=2,
                    validation_data=(X_valid, Y_valid))

ValueError: Error when checking target: expected time_distributed_26 to have shape (168, 24) but got array with shape (24, 1)

Any help would be greatly appreciated, as I don't fully understand why the time distributed layer expects all 168 hours of the past (with 24 features?), and not just the prediction of the future.

Upvotes: 0

Views: 736

Answers (1)

Michael Grogan

Reputation: 1026

Let's consider X_train.shape and Y_train.shape in the first instance.

X_train.shape is structured as follows: (samples, time steps, features).

Given (43800, 168, 6), this means that:

  • 43800 observations (samples) are being used to train the model.

  • There are 168 time steps in each sample, i.e. the number of hourly steps back in time that the model sees for each prediction.

  • There are 6 features in the model, i.e. each time step consists of six features.
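To make the three dimensions concrete, here is a minimal sketch of how a sliding window over an hourly table produces arrays of this layout. The question does not show its windowing code, so the `make_windows` helper, the synthetic data, and the choice of column 0 as the price are all assumptions for illustration only.

```python
import numpy as np

def make_windows(data, target_col, lookback, horizon):
    """Slice a (hours, features) array into inputs and multi-step targets.

    Hypothetical helper, not taken from the question.
    """
    # Each window needs `lookback` hours of history plus `horizon` hours ahead.
    n_windows = len(data) - lookback - horizon + 1
    X = np.stack([data[i : i + lookback] for i in range(n_windows)])
    Y = np.stack([
        data[i + lookback : i + lookback + horizon, target_col]
        for i in range(n_windows)
    ])
    return X, Y

# Synthetic stand-in for the real dataset: 500 hours, 6 features.
data = np.arange(500 * 6, dtype=float).reshape(500, 6)

X, Y = make_windows(data, target_col=0, lookback=168, horizon=24)
print(X.shape)  # (samples, time steps, features)
print(Y.shape)  # (samples, forecast horizon)
```

Note that Y built this way is two-dimensional: one row of 24 future values per sample, with no time-step axis of length 168.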

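Now, because every LSTM layer in your model uses return_sequences=True and the output layer is wrapped in TimeDistributed, the model emits one output vector per input time step. With 168 time steps in X_train and Dense(24) as the final layer, the output for each sample has shape (time steps, units) = (168, 24), and Keras expects each Y_train sample to have that same shape.

Your Y_train does not match this: the error reports each target sample as (24, 1), i.e. 24 time steps with 1 value each, whereas the model as specified produces 168 time steps with 24 values each.

However many time steps the model outputs, they must be consistent with Y_train. Either reshape Y_train to match the per-time-step output, or change the model so that it returns only the final 24-value prediction (for example, set return_sequences=False on the last LSTM layer and follow it with a plain Dense(24) instead of the TimeDistributed wrapper). Hope this helps.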
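The shape bookkeeping can be traced by hand. The sketch below is plain Python, not Keras: `lstm_shape` and `dense_shape` are hypothetical helpers that only mimic how an LSTM layer and a Dense layer (with or without the TimeDistributed wrapper, which gives the same result on 3D input) transform the per-sample shape. It compares the model as posted with a variant whose last LSTM returns only its final state.

```python
def lstm_shape(input_shape, units, return_sequences):
    """Per-sample output shape of an LSTM fed (time steps, features)."""
    time_steps, _ = input_shape
    # return_sequences=True keeps the time axis; False keeps only the last step.
    return (time_steps, units) if return_sequences else (units,)

def dense_shape(input_shape, units):
    """Dense acts on the last axis only; leading axes pass through unchanged."""
    return input_shape[:-1] + (units,)

x = (168, 6)  # one sample of X_train: 168 time steps, 6 features

# Model as posted: both LSTM layers return full sequences.
h = lstm_shape(x, 10, return_sequences=True)
h = lstm_shape(h, 20, return_sequences=True)
as_posted = dense_shape(h, 24)

# Variant: the last LSTM returns only its final output.
h = lstm_shape(x, 10, return_sequences=True)
h = lstm_shape(h, 20, return_sequences=False)
variant = dense_shape(h, 24)

print(as_posted)  # per-sample output shape of the posted model
print(variant)    # per-sample output shape of the variant
```

The posted model yields (168, 24) per sample, which is the shape named in the error message, while the variant yields (24,), which lines up with a Y_train of shape (43800, 24).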

Upvotes: 2
