Cristian Neufuss

Reputation: 23

How to solve "logits and labels must have the same first dimension" error

I'm trying out different neural network architectures for a word-based NLP task.

So far I've used bidirectional, embedding-based, and GRU models, guided by this tutorial: https://towardsdatascience.com/language-translation-with-rnns-d84d43b40571, and it all worked well. When I try using LSTMs, however, I get an error saying:

logits and labels must have the same first dimension, got logits shape [32,186] and labels shape [4704]

How can I solve this?

My source and target datasets each consist of 7200 sample sentences. They are integer-tokenized and embedded. The source dataset is post-padded to match the length of the target dataset.
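For context, the preprocessing looks roughly like this (a simplified sketch, not my exact code; src_sentences, tgt_sentences, and tgt_len are placeholder names):

from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# src_sentences / tgt_sentences: lists of 7200 raw sentence strings
src_tok = Tokenizer()
src_tok.fit_on_texts(src_sentences)

tgt_tok = Tokenizer()
tgt_tok.fit_on_texts(tgt_sentences)
tgt_seqs = tgt_tok.texts_to_sequences(tgt_sentences)
tgt_len = max(len(s) for s in tgt_seqs)  # length of the longest target sentence

# post-pad the source sequences to the target length
X = pad_sequences(src_tok.texts_to_sequences(src_sentences), maxlen=tgt_len, padding='post')
Y = pad_sequences(tgt_seqs, maxlen=tgt_len, padding='post')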

Here is my model and the relevant code:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense, Dropout
from tensorflow.keras.optimizers import Adam

lstm_model = Sequential()
lstm_model.add(Embedding(src_vocab_size, 128, input_length=X.shape[1], input_shape=X.shape[1:]))
lstm_model.add(LSTM(128, return_sequences=False, dropout=0.1, recurrent_dropout=0.1))
lstm_model.add(Dense(128, activation='relu'))
lstm_model.add(Dropout(0.5))
lstm_model.add(Dense(target_vocab_size, activation='softmax'))

lstm_model.compile(optimizer=Adam(0.002), loss='sparse_categorical_crossentropy', metrics=['accuracy'])

history = lstm_model.fit(X, Y, batch_size=32, callbacks=CALLBACK, epochs=100, validation_split=0.25)  # The error is raised at this line!

With the shapes X.shape = (7200, 147) and Y.shape = (7200, 147) (7200 sentences, each padded to length 147).

I've already looked at similar questions on here and tried adding a Reshape layer

lstm_model.add(Reshape((-1,)))

but this only causes the following error:

"TypeError: __int__ returned non-int (type NoneType)"

It's really strange, since I preprocess the dataset the same way for all models and everything works fine except for the LSTM model above.

Upvotes: 0

Views: 143

Answers (1)

Jindřich

Reputation: 11220

You should set return_sequences=True (and leave return_state=False, the default) when calling the LSTM constructor.
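The difference is easy to see from the output shapes (a quick sketch using dummy data with the dimensions from your model):

import numpy as np
from tensorflow.keras.layers import LSTM

x = np.random.rand(32, 147, 128).astype("float32")  # (batch, timesteps, embedding dim)
print(LSTM(128, return_sequences=False)(x).shape)   # (32, 128): only the last state
print(LSTM(128, return_sequences=True)(x).shape)    # (32, 147, 128): one state per timestep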

In your snippet, the LSTM returns only its last state, instead of the sequence of states for every input embedding. In theory, you could have spotted this from the error message:

logits and labels must have the same first dimension, got logits shape [32,186] and labels shape [4704]

The logits should be three-dimensional: batch size × sequence length × number of classes. The sequence length is 147, and indeed 32 × 147 = 4704, the number of your labels. This could have told you that the sequence-length dimension had disappeared.
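Concretely, the fixed model could look like this (a sketch reusing the sizes and variable names from your snippet):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense, Dropout
from tensorflow.keras.optimizers import Adam

lstm_model = Sequential()
lstm_model.add(Embedding(src_vocab_size, 128, input_length=X.shape[1]))
# return_sequences=True keeps one output per timestep, so the Dense layers
# below are applied at every timestep and the logits get the shape
# (batch size, sequence length, target_vocab_size), matching the labels
lstm_model.add(LSTM(128, return_sequences=True, dropout=0.1, recurrent_dropout=0.1))
lstm_model.add(Dense(128, activation='relu'))
lstm_model.add(Dropout(0.5))
lstm_model.add(Dense(target_vocab_size, activation='softmax'))

lstm_model.compile(optimizer=Adam(0.002), loss='sparse_categorical_crossentropy', metrics=['accuracy'])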

Upvotes: 1
