Reputation: 2741
I have varying-length inputs (sample inputs below):
[0.501757009346, 0.554708349218]
[0.460997102135, 0.554708349218]
[0.377844867627]
[0.328125, 0.554708349218]
[-0.266091572661, 0.554708349218, 0.554708349218]
[0.514723203769]
[0.104587155963, 0.554708349218]
[0.247003647733, 0.554708349218]
[0.586212380233]
[0.559979406212, 0.554708349218]
[0.412262156448, 0.554708349218]
So I have padded the input sequences as follows:
In [115]: from keras.preprocessing.sequence import pad_sequences
In [116]: max_sequence_length = max([len(i) for i in X])
In [117]: padded_sequences = pad_sequences(X, max_sequence_length).tolist()
In [118]: X_padd=np.array(padded_sequences)
In [119]: X_padd.shape
Out[119]: (13189, 694)
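For illustration, here is a small sketch of what this padding step does to a few of the sample sequences above (note that pad_sequences defaults to dtype='int32', so dtype='float32' is passed here to keep the float values):
import numpy as np
from keras.preprocessing.sequence import pad_sequences

X_sample = [[0.501757009346, 0.554708349218],
            [0.377844867627],
            [-0.266091572661, 0.554708349218, 0.554708349218]]

max_len = max(len(i) for i in X_sample)               # 3
padded = pad_sequences(X_sample, max_len, dtype='float32')
print(padded.shape)   # (3, 3) -> (n_samples, max_sequence_length)
print(padded[1])      # e.g. [0.  0.  0.37784487] -- zero-padded on the left by default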
Now I need to reshape the input to be [samples, time steps, features] to implement an LSTM layer, as per the Keras documentation.
But when I reshape the padded input array as
X_reshaped = X_padd.reshape(X_padd.shape[1], max_sequence_length, X_padd.shape[0])
it throws the error below. Please help me resolve this. Thanks.
In [120]: X_reshaped = X_padd.reshape(X_padd.shape[1], max_sequence_length, X_padd.shape[0])
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-120-86980292fb31> in <module>()
----> 1 X_reshaped = X_padd.reshape(X_padd.shape[1], max_sequence_length, X_padd.shape[0])
ValueError: total size of new array must be unchanged
max_sequence_length = max([len(i) for i in X])
padded_sequences = pad_sequences(X, max_sequence_length).tolist()
X_padd=np.array(padded_sequences) # shape -> (13023, 694)
X_reshaped = X_padd.reshape(X_padd.shape[0],X_padd.shape[1],1)
X_train, X_test, Y_train, Y_test = cross_validation.train_test_split(X_reshaped,Y,test_size=0.2,random_state=42)
input_length = X_train.shape[0]
input_dim = X_train.shape[1]
model=Sequential()
model.add(LSTM(4, input_dim=input_dim, input_length=input_length))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X_train, Y_train, nb_epoch=50, batch_size=12)
On fitting the data to the model, I get the error below:
Exception: Error when checking model input: expected lstm_input_4 to have shape (None, 10418, 694) but got array with shape (10418, 694, 1)
Upvotes: 3
Views: 675
Reputation: 11553
As I understand it, you don't have features here: you have sequences of numbers, not sequences of vectors, so your shape is (n_samples, time_steps).
So if you want to make a 3D tensor as input:
X_reshaped = X_pad.reshape(X_pad.shape[0], X_pad.shape[1], 1)
Remember that X_pad.shape[1] is your max_sequence_length. So you were trying to reshape a tensor of shape (13189, 694) into one of shape (694, 694, 13189). The new shape has far more values, hence the complaint.
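To see the size constraint concretely, here is a small sketch with a dummy array of the same shape (names are illustrative, not from the question's code):
import numpy as np

X_pad = np.zeros((13189, 694))     # same shape as the padded data
print(X_pad.size)                  # 13189 * 694 = 9153166 elements

# Adding a trailing feature axis keeps the total number of elements unchanged:
X_reshaped = X_pad.reshape(X_pad.shape[0], X_pad.shape[1], 1)
print(X_reshaped.shape)            # (13189, 694, 1)

# A target such as (694, 694, 13189) would require far more elements,
# so NumPy raises "total size of new array must be unchanged".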
I hope this helps
EDIT:
After the reshape, your training data has shape (n_samples, time_steps, num_feat). Therefore, the input to your LSTM will have shape (batch_size, time_steps, features). So when you specify input_length and input_dim, you should use the time_steps and num_feat values instead of n_samples and time_steps.
So change:
input_length = X_train.shape[1]
input_dim = X_train.shape[2]
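Putting it together, a minimal sketch of the corrected pipeline, reusing the question's variable names (X_padd, Y), the Keras 1.x-style arguments (input_dim/input_length, nb_epoch), and the older sklearn.cross_validation module; in newer Keras you would pass input_shape=(time_steps, features) to the LSTM instead:
from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense, Activation
from sklearn import cross_validation   # older scikit-learn API, as in the question

# (n_samples, time_steps) -> (n_samples, time_steps, 1)
X_reshaped = X_padd.reshape(X_padd.shape[0], X_padd.shape[1], 1)

X_train, X_test, Y_train, Y_test = cross_validation.train_test_split(
    X_reshaped, Y, test_size=0.2, random_state=42)

input_length = X_train.shape[1]   # time_steps (= max_sequence_length, 694)
input_dim = X_train.shape[2]      # features per time step (= 1)

model = Sequential()
model.add(LSTM(4, input_dim=input_dim, input_length=input_length))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X_train, Y_train, nb_epoch=50, batch_size=12)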
Upvotes: 2