Lostsoul

Reputation: 25999

"Layer bidirectional is incompatible with the layer" when trying to connect a Dense layer to an LSTM

I'm playing with a multiclass classification problem and, for fun, I wanted to try different models. I found a blog post that used an LSTM for classification and tried to adapt my model to match it.

Here is my model:

  from tensorflow import keras
  from tensorflow.keras.models import Sequential
  from tensorflow.keras.layers import Dense, Dropout, Activation, Bidirectional, LSTM
  from tensorflow.keras.optimizers import SGD, Adam 

  x_train_shape = X_train.shape[1]
  model = Sequential()
  model.add(Dense(x_train_shape, activation='tanh', input_dim=x_train_shape))
  # model.add(Dropout(0.2))
  model.add(Bidirectional(LSTM(32)))
  
  # model.add(Dense(x_train_shape, activation='tanh'))
  # model.add(Dense(x_train_shape, activation='tanh'))
  model.add(Dense(len(labels), activation='softmax'))


  model.compile(loss='categorical_crossentropy',
                optimizer="adam", metrics=['accuracy', 'TopKCategoricalAccuracy', 'FalsePositives'])

  model.fit(X_train, y_train, epochs=500, batch_size=200)

It returns this error:

ValueError: Input 0 of layer bidirectional_5 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 109]

If I uncomment the Dense layers below the LSTM and comment out the LSTM, the model works, so it's definitely related to the LSTM line. For reference, this is roughly the Dense-only variant that trains without errors (same X_train, y_train, and labels as above):
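
  model = Sequential()
  model.add(Dense(x_train_shape, activation='tanh', input_dim=x_train_shape))
  model.add(Dense(x_train_shape, activation='tanh'))
  model.add(Dense(len(labels), activation='softmax'))

  model.compile(loss='categorical_crossentropy',
                optimizer="adam", metrics=['accuracy', 'TopKCategoricalAccuracy', 'FalsePositives'])
  model.fit(X_train, y_train, epochs=500, batch_size=200)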

How can I connect an LSTM layer to a Dense layer for multiclass classification?

Upvotes: 0

Views: 1156

Answers (1)

Nicolas Gervais

Reputation: 36614

The error means that Bidirectional(LSTM) expects 3-D input of shape (batch, timesteps, features), but it is receiving the 2-D output of your Dense layer. Try reshaping your data to 3-D and putting a TimeDistributed layer around the Dense layer. Here's an example with bogus data:

from tensorflow import keras
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import *

# Bogus 3-D data: (samples, timesteps, features)
X_train = np.random.rand(100, 1, 10)
y_train = np.random.randint(0, 10, 100)
y_train = keras.utils.to_categorical(y_train)

assert X_train.ndim == 3

model = Sequential()
# TimeDistributed applies the Dense layer to every timestep, so the output stays 3-D
model.add(TimeDistributed(Dense(10), input_shape=(X_train.shape[1:])))
# Bidirectional(LSTM) now receives the 3-D input it expects
model.add(Bidirectional(LSTM(8)))

model.add(Dense(8, activation='tanh'))
model.add(Dense(8, activation='tanh'))
model.add(Dense(y_train.shape[-1], activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer="adam")

history = model.fit(X_train, y_train, epochs=1, batch_size=8)

Train on 100 samples
  8/100 [=>............................] - ETA: 0s - loss: 2.2984
 80/100 [=======================>......] - ETA: 0s - loss: 2.2863
100/100 [==============================] - 0s 950us/sample - loss: 2.2984
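
To apply this to the data in the question, a minimal sketch (assuming X_train is the 2-D (samples, 109) array and y_train/labels are as in the question) is to add a length-1 timesteps axis before fitting:

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Bidirectional, LSTM, TimeDistributed

# Add a length-1 timesteps axis: (samples, 109) -> (samples, 1, 109)
X_train_3d = np.expand_dims(X_train, axis=1)

model = Sequential()
model.add(TimeDistributed(Dense(X_train_3d.shape[-1]), input_shape=X_train_3d.shape[1:]))
model.add(Bidirectional(LSTM(32)))
model.add(Dense(len(labels), activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train_3d, y_train, epochs=500, batch_size=200)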

Upvotes: 1
