Anonymous

Reputation: 1

Keras LSTM Input - ValueError: Error when checking input: expected input_1 to have 3 dimensions, but got array with shape (1745, 1)

My Keras RNN code is as follows:

from keras.layers import Input, LSTM, Dense, Dropout, Activation
from keras.models import Model

def RNN():
    inputs = Input(shape=(None, word_vector_size))
    layer = LSTM(64)(inputs)
    layer = Dense(256, name='FC1')(layer)
    layer = Dropout(0.5)(layer)
    layer = Dense(num_classes, name='out_layer')(layer)
    layer = Activation('softmax')(layer)
    model = Model(inputs=inputs, outputs=layer)
    return model

I get the error when I call model.fit():

model.fit(np.array(word_vector_matrix), np.array(Y_binary), batch_size=128, epochs=10, validation_split=0.2, callbacks=[EarlyStopping(monitor='val_loss',min_delta=0.0001)])

word_vector_matrix is a 3-dimensional numpy array. I have printed the following:

print(type(word_vector_matrix), type(word_vector_matrix[0]), type(word_vector_matrix[0][0]), type(word_vector_matrix[0][0][0]))

and the output is:

<class 'numpy.ndarray'> <class 'numpy.ndarray'> <class 'numpy.ndarray'> <class 'numpy.float32'>

Its shape is 1745 x sentence length x word vector size. The sentence length is variable, and I'm trying to pass this entire word vector matrix to the RNN, but I get the error above.

I print the shape like this:

print(word_vector_matrix.shape)

The output is (1745,).

The shapes of the nested arrays are printed like this:

print(word_vector_matrix[10].shape)

The output is (7, 300). The first number, 7, is the sentence length, which varies from sentence to sentence; the second number, 300, is the word vector size, which is fixed for all words.

I have converted everything with np.array() as suggested in other posts, but I still get the same error. Can someone please help me? I'm using Python 3, by the way; the same code works for me in Python 2, but not in Python 3. Thanks!

Upvotes: 0

Views: 355

Answers (1)

Mark Loyman

Reputation: 2170

word_vector_matrix is not a 3-D ndarray. It's a 1-D ndarray of 2-D arrays, because the sentence length varies.

NumPy allows an ndarray to act as a list-like structure whose elements are themselves complex objects (other ndarrays). In Keras, however, the ndarray must be converted into a tensor, which has to be a dense "mathematical" array of some fixed dimension; this is required for efficient computation.
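You can reproduce this on a small scale. The sizes below are made up for illustration; note that recent NumPy versions also require an explicit dtype=object when building such a ragged array:

```python
import numpy as np

# Two "sentences" of different lengths, each word a 300-dim vector
a = np.zeros((7, 300), dtype=np.float32)
b = np.zeros((5, 300), dtype=np.float32)

# Because the lengths differ, this cannot become a true 3-D array
ragged = np.array([a, b], dtype=object)
print(ragged.shape)     # (2,) -- a 1-D array of 2-D arrays, just like (1745,)
print(ragged[0].shape)  # (7, 300)
```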

Therefore, each batch must have fixed size sentences (and not the entire data).

Here are a few alternatives:

  1. Use a batch size of 1 - the simplest approach, but it impedes your network's convergence. I would suggest using it only as a temporary sanity check.
  2. If sequence length variability is low, pad all your batches to the same length.
  3. If sequence length variability is high, pad each batch to the max length within that batch. This would require you to use a custom data generator.
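As a sketch of the padding in options 2 and 3 (the helper name and per-batch grouping are illustrative, not from the original answer):

```python
import numpy as np

def pad_batch(sequences, vector_size=300):
    """Zero-pad a list of (length, vector_size) arrays to the batch's max length."""
    max_len = max(s.shape[0] for s in sequences)
    batch = np.zeros((len(sequences), max_len, vector_size), dtype=np.float32)
    for i, s in enumerate(sequences):
        batch[i, :s.shape[0], :] = s
    return batch

# Two sentences of lengths 7 and 5 become one dense (2, 7, 300) batch
batch = pad_batch([np.ones((7, 300), dtype=np.float32),
                   np.ones((5, 300), dtype=np.float32)])
print(batch.shape)  # (2, 7, 300)
```

A generator would call something like pad_batch on each group of sentences (ideally bucketed by similar length, so little padding is wasted) and yield the dense batches to model.fit_generator.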

Note: after you have padded your data, you need to use a Masking layer so that the padded timesteps are ignored during training.
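A minimal sketch of where Masking fits into the poster's model, assuming TensorFlow's bundled Keras and zero-padding; num_classes is a hypothetical value:

```python
# Sketch only: assumes tf.keras; num_classes is hypothetical,
# word_vector_size matches the post's 300.
from tensorflow.keras.layers import Input, Masking, LSTM, Dense
from tensorflow.keras.models import Model

word_vector_size = 300
num_classes = 5  # hypothetical

inputs = Input(shape=(None, word_vector_size))
x = Masking(mask_value=0.0)(inputs)  # all-zero timesteps are skipped downstream
x = LSTM(64)(x)
outputs = Dense(num_classes, activation='softmax')(x)
model = Model(inputs=inputs, outputs=outputs)
```

The mask_value must match whatever value you padded with (0.0 in the padding sketch above is the usual choice).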

Upvotes: 1
