Reputation: 33
I have a simple model that flattens sequence embeddings and then sums them. When I run predict I get no errors and the output has the shape I expect, but when I try to train I get a shape mismatch error.
Here is the model:
import numpy as np
from keras import backend as K
from keras.models import Model
from keras.layers.embeddings import Embedding
from keras.layers import Input, Reshape, Lambda
inputs = Input(shape=(20,), name="inputs")
embedding = Embedding(69, 100, name="embeddings")(inputs)
out = Reshape((2000,), name='reshape_embeddings')(embedding)
out = Lambda(lambda x: K.sum(x, axis=1), name='sum_embeddings')(out)
model = Model(inputs, out)
model.compile('adam', 'mean_squared_error')
print(model.summary())
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
inputs (InputLayer) (None, 20) 0
_________________________________________________________________
embeddings (Embedding) (None, 20, 100) 6900
_________________________________________________________________
reshape_embeddings (Reshape) (None, 2000) 0
_________________________________________________________________
sum_embeddings (Lambda) (None,) 0
=================================================================
Total params: 6,900
Trainable params: 6,900
Non-trainable params: 0
_________________________________________________________________
Here I build a random x,y sample:
x = np.random.randint(69, size=(500,20))
y = np.random.uniform(0, 1, size=(500,))
When I predict on x I get the correct output shape:
preds = model.predict(x)
print(preds.shape == y.shape)
When I fit the model I get the following error:
model.fit(x, y, batch_size=50, verbose=1)
ValueError: Error when checking target: expected sum_embeddings to have 1 dimensions, but got array with shape (500, 1)
It feels like I am missing something really simple. Any suggestions would be greatly appreciated.
Upvotes: 1
Views: 694
Reputation: 11333
Yes, there are a couple of simple issues with your code. The output of your model needs to have at least rank 2, in this case (None, 1). (My 2 cents is that the optimizer complains when it doesn't.) You get that by passing keepdims=True to K.sum. Then you have to add one dimension to y too, so that its shape is (500, 1).
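To see what keepdims changes, here is a minimal NumPy-only sketch. It assumes K.sum follows NumPy's axis and keepdims semantics, and the array `flat` is a made-up stand-in for the output of the Reshape layer, not real model output.

```python
import numpy as np

# Hypothetical stand-in for the (batch, 2000) output of reshape_embeddings.
flat = np.ones((500, 2000))

# Without keepdims the summed axis is dropped, leaving a rank-1 array.
summed = np.sum(flat, axis=1)

# With keepdims=True the summed axis is kept with length 1, giving rank 2.
summed_keep = np.sum(flat, axis=1, keepdims=True)

print(summed.shape)       # (500,)
print(summed_keep.shape)  # (500, 1)
```

The values are the same in both cases; only the rank of the result differs.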
inputs = Input(shape=(20,), name="inputs")
embedding = Embedding(69, 100, name="embeddings")(inputs)
out = Reshape((2000,), name='reshape_embeddings')(embedding)
out = Lambda(lambda x: K.sum(x, axis=1, keepdims=True), name='sum_embeddings')(out)
model = Model(inputs, out)
model.compile('adam', 'mean_squared_error')
print(model.summary())
x = np.random.randint(69, size=(500,20))
y = np.random.uniform(0, 1, size=(500,1))
preds = model.predict(x)
print(preds.shape == y.shape)
model.fit(x, y, batch_size=50, verbose=1)
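If the targets already exist as a 1-D array, they can be given the extra axis instead of being regenerated. This is a small sketch that assumes a NumPy array; `y_flat`, `y_col` and `y_col_alt` are illustrative names that do not appear in the code above.

```python
import numpy as np

# Illustrative 1-D targets, shaped like the y in the question.
y_flat = np.random.uniform(0, 1, size=(500,))

# reshape(-1, 1) turns (500,) into (500, 1); -1 lets NumPy infer the row count.
y_col = y_flat.reshape(-1, 1)

# np.expand_dims is an equivalent way to add the trailing axis.
y_col_alt = np.expand_dims(y_flat, axis=-1)

print(y_col.shape, y_col_alt.shape)
```

Either form gives targets with the same rank as the (None, 1) model output.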
Upvotes: 1