Ryan Walden

Reputation: 167

How to correctly shape my CNN-LSTM input layer

I have a data set with the shape (3340, 6). I want to use a CNN-LSTM to read a sequence of 30 rows and predict the 6 elements of the next row. From what I have read, this is considered a multi-parallel time series. I have been primarily following this Machine Learning Mastery tutorial and am having trouble implementing the CNN-LSTM architecture for a multi-parallel time series.

I have used this function to split the data into 30-day time-step frames:

from numpy import array

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        # stop once the target row would fall past the end of the dataset
        if end_ix > len(sequences)-1:
            break
        # input: n_steps rows starting at i; output: the single row that follows
        seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix, :]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

Here is a sample of the data frames produced by the function above.

   # 30 Time Step Input Frame X[0], X.shape = (3310, 30, 6)
   [4.951e-02, 8.585e-02, 5.941e-02, 8.584e-02, 8.584e-02, 5.000e+00],
   [8.584e-02, 9.307e-02, 7.723e-02, 8.080e-02, 8.080e-02, 4.900e+01],
   [8.080e-02, 8.181e-02, 7.426e-02, 7.474e-02, 7.474e-02, 2.000e+01],
   [7.474e-02, 7.921e-02, 6.634e-02, 7.921e-02, 7.921e-02, 4.200e+01],
   ...

   # 1 Time Step Output Array y[0], y.shape = (3310, 6)
   [6.550e-02, 7.690e-02, 6.243e-02, 7.000e-02, 7.000e-02, 9.150e+02]
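As a quick sanity check on the shapes quoted above, the windowing can be reproduced on placeholder data. This is a minimal sketch: the random array stands in for the real data set, and the function is a condensed rewrite of split_sequences rather than the original code.

```python
import numpy as np

def split_sequences(sequences, n_steps):
    """Slide a window of n_steps rows; the row right after each window is the target."""
    X, y = [], []
    for i in range(len(sequences) - n_steps):
        X.append(sequences[i:i + n_steps, :])   # 30 input rows
        y.append(sequences[i + n_steps, :])     # the next row
    return np.array(X), np.array(y)

# Placeholder data (assumption): random values with the same shape as the real data set.
data = np.random.rand(3340, 6)
X, y = split_sequences(data, n_steps=30)
print(X.shape, y.shape)  # (3310, 30, 6) (3310, 6)
```

Each sample in X is 3D once batched, (samples, time steps, features), which is the layout Conv1D and LSTM layers expect.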

Here is the model that I am using:

model = Sequential()
model.add(TimeDistributed(Conv1D(64, 1, activation='relu'), input_shape=(None, 30, 6)))
model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
model.add(TimeDistributed(Flatten()))
model.add(LSTM(50, activation='relu', return_sequences=True))
model.add(Dense(6))
model.compile(optimizer='adam', loss='mse')

When I run model.fit, I receive the following error:

ValueError: Error when checking input: expected time_distributed_59_input to have 
4 dimensions, but got array with shape (3310, 30, 6)

I am at a loss as to how to properly shape my input layer so that I can get this model learning. I have built several Conv2D nets in the past, but this is my first time series model, so I apologize if there's an obvious answer here that I am missing.

Upvotes: 0

Views: 565

Answers (1)

OverLordGoldDragon

Reputation: 19776

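  • Remove TimeDistributed from Conv1D and MaxPooling1D; both layers accept 3D inputs of shape (batch, steps, channels) directly, so input_shape becomes (30, 6)
  • Remove Flatten(), as it destroys timesteps-channels relationships and leaves the LSTM without a time axis
  • Add TimeDistributed to the last Dense layer if you keep LSTM(return_sequences=True), which returns a 3D output; alternatively, use return_sequences=False, which yields an output of shape (batch, 6) that matches y.shape = (3310, 6)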
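
Put together, the changes above could look like the sketch below. It takes the return_sequences=False route so the output matches the (3310, 6) targets. The tensorflow.keras import paths and the explicit Input layer are assumptions (the question does not show its imports), and the zero-filled batch is placeholder data used only to check shapes.

```python
import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, LSTM, Dense

model = Sequential()
# One sample is (30 time steps, 6 features); the batch axis is implicit.
model.add(Input(shape=(30, 6)))
# Conv1D and MaxPooling1D operate on 3D input directly, no TimeDistributed needed.
model.add(Conv1D(64, 1, activation='relu'))
model.add(MaxPooling1D(pool_size=2))
# No Flatten: the LSTM still needs a (steps, channels) input.
# return_sequences=False (the default) returns only the last step's output.
model.add(LSTM(50, activation='relu'))
model.add(Dense(6))
model.compile(optimizer='adam', loss='mse')

# Shape check on a placeholder batch of 4 samples.
preds = model.predict(np.zeros((4, 30, 6)), verbose=0)
print(model.output_shape, preds.shape)
```

With this layout, model.fit(X, y) can take X of shape (3310, 30, 6) and y of shape (3310, 6) without reshaping either array.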

Upvotes: 2
