Reputation: 7056
I am trying to learn the Keras functional API through the tutorials from Keras, and when I try to modify the example, I get a shape mismatch. The only difference between the tutorial code and the one below is that I removed the embedding layer, since mine is a regression problem.
Firstly, I am aware that LSTM expects a 3-dimensional input. In my example, I have:
import numpy as np
from tensorflow import keras
from tensorflow.keras import Model
from tensorflow.keras.layers import Input, LSTM, Dense

TRAIN_BATCH_SIZE=32
MODEL_INPUT_BATCH_SIZE=128
headline_data = np.random.uniform(low=1, high=9000, size=(MODEL_INPUT_BATCH_SIZE, 100)).astype(np.float32)
additional_data = np.random.uniform(low=1, high=9000, size=(MODEL_INPUT_BATCH_SIZE, 5)).astype(np.float32)
labels = np.random.randint(0, 1 + 1, size=(MODEL_INPUT_BATCH_SIZE, 1))
main_input = Input(shape=(100,), dtype='float32', name='main_input')
lstm_out = LSTM(32)(main_input)
auxiliary_output = Dense(1, activation='sigmoid', name='aux_output')(lstm_out)
auxiliary_input = Input(shape=(5,), name='aux_input')
x = keras.layers.concatenate([lstm_out, auxiliary_input])
# We stack a deep densely-connected network on top
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)
# And finally we add the main logistic regression layer
main_output = Dense(1, activation='sigmoid', name='main_output')(x)
# This defines a model with two inputs and two outputs:
model = Model(inputs=[main_input, auxiliary_input], outputs=[main_output, auxiliary_output])
model.compile(optimizer='rmsprop',
              loss={'main_output': 'binary_crossentropy', 'aux_output': 'binary_crossentropy'},
              loss_weights={'main_output': 1., 'aux_output': 0.2})
# And trained it via:
model.fit({'main_input': headline_data, 'aux_input': additional_data},
          {'main_output': labels, 'aux_output': labels},
          epochs=2, batch_size=TRAIN_BATCH_SIZE)
When I run the above, I get:
ValueError: Input 0 is incompatible with layer lstm_1: expected ndim=3, found ndim=2
So, I tried changing my input shape like so:
main_input = Input(shape=(100,1), dtype='float64', name='main_input')
and when I run this, I get:
ValueError: Error when checking input: expected main_input to have 3 dimensions, but got array with shape (128, 100)
I am perplexed and lost as to where the error is coming from. Would really appreciate some guidance on this.
EDIT
I have also tried setting:
headline_data = np.expand_dims(headline_data, axis=2)
and then used:
main_input = Input(shape=headline_data.shape, dtype='float64', name='main_input')
then, I get:
ValueError: Input 0 is incompatible with layer lstm_1: expected ndim=3, found ndim=4
This seems really strange!
Upvotes: 0
Views: 959
Reputation: 1246
ValueError: Input 0 is incompatible with layer lstm_1: expected ndim=3, found ndim=2
Your problem is with the shape of your data.
headline_data = np.random.uniform(low=1, high=9000, size=(MODEL_INPUT_BATCH_SIZE, 100))
headline_data.shape
returns
(128,100)
However, an LSTM layer expects a 3-dimensional input of shape (batch_size, timesteps, features).
Without double-checking, you probably need to do something like:
headline_data = headline_data.reshape(128, 1, 100)
Have a look at this post; it should clear everything up.
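As a quick sketch of why, here are the two common ways to lift your 2-D array into the 3-D shape LSTM expects (my own illustration, not from the tutorial); which one is right depends on whether the 100 values are 100 features of a single timestep or a sequence of 100 one-feature steps:
import numpy as np

data = np.random.uniform(low=1, high=9000, size=(128, 100)).astype(np.float32)

# Option A: one timestep with 100 features -> Input(shape=(1, 100))
one_step = data.reshape(128, 1, 100)          # (batch, timesteps, features)

# Option B: 100 timesteps with 1 feature each -> Input(shape=(100, 1))
one_feature = np.expand_dims(data, axis=2)    # (128, 100, 1)
Note that Input(shape=...) never includes the batch dimension; that is why passing headline_data.shape (which already contains the 128) in your EDIT produced "expected ndim=3, found ndim=4".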
* UPDATE *
Do the following:
headline_data = headline_data.reshape(128, 1, 100)
main_input = Input(shape=(1,100), dtype='float32', name='main_input')
I tested it and it works, so let me know if it doesn't for you =)
---- Complete Code: ----
import numpy as np
from tensorflow import keras
from tensorflow.keras import Model
from tensorflow.keras.layers import Input, LSTM, Dense
TRAIN_BATCH_SIZE=32
MODEL_INPUT_BATCH_SIZE=128
headline_data = np.random.uniform(low=1, high=9000, size=(MODEL_INPUT_BATCH_SIZE, 100)).astype(np.float32)
lstm_data = headline_data.reshape(MODEL_INPUT_BATCH_SIZE, 1, 100)  # (128, 1, 100): one timestep of 100 features
additional_data = np.random.uniform(low=1, high=9000, size=(MODEL_INPUT_BATCH_SIZE, 5)).astype(np.float32)
labels = np.random.randint(0, 1 + 1, size=(MODEL_INPUT_BATCH_SIZE, 1))
main_input = Input(shape=(1,100), dtype='float32', name='main_input')
lstm_out = LSTM(32)(main_input)
auxiliary_output = Dense(1, activation='sigmoid', name='aux_output')(lstm_out)
auxiliary_input = Input(shape=(5,), name='aux_input')
x = keras.layers.concatenate([lstm_out, auxiliary_input])
# We stack a deep densely-connected network on top
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)
# And finally we add the main logistic regression layer
main_output = Dense(1, activation='sigmoid', name='main_output')(x)
# This defines a model with two inputs and two outputs:
model = Model(inputs=[main_input, auxiliary_input], outputs=[main_output, auxiliary_output])
model.compile(optimizer='rmsprop',
              loss={'main_output': 'binary_crossentropy', 'aux_output': 'binary_crossentropy'},
              loss_weights={'main_output': 1., 'aux_output': 0.2})
# And trained it via:
model.fit({'main_input': lstm_data, 'aux_input': additional_data},
          {'main_output': labels, 'aux_output': labels},
          epochs=1000, batch_size=TRAIN_BATCH_SIZE)
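If you would rather keep your original idea of shape=(100, 1), here is a hedged variant of the relevant lines (same fix, just the other interpretation; I have not benchmarked it). The key point is that the array itself must be reshaped to match, not just the Input:
# Alternative: treat the 100 values as a sequence of 100 one-feature timesteps
lstm_data = headline_data.reshape(MODEL_INPUT_BATCH_SIZE, 100, 1)
main_input = Input(shape=(100, 1), dtype='float32', name='main_input')
lstm_out = LSTM(32)(main_input)  # the recurrence now unrolls over 100 steps
This is slower per sample (the LSTM steps 100 times instead of once), but it is usually what you want when the 100 values actually form a sequence.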
Upvotes: 3