Reputation: 25999
I'm playing with a multiclass classification problem, and for fun I wanted to try different models. I found a blog post that used an LSTM for classification, and I was trying to adapt my model to match it.
Here is my model:
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation, Bidirectional, LSTM
from tensorflow.keras.optimizers import SGD, Adam
x_train_shape = X_train.shape[1]
model = Sequential()
model.add(Dense(x_train_shape, activation='tanh', input_dim=x_train_shape))
# model.add(Dropout(0.2))
model.add(Bidirectional(LSTM(32)))
# model.add(Dense(x_train_shape, activation='tanh'))
# model.add(Dense(x_train_shape, activation='tanh'))
model.add(Dense(len(labels), activation='softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer="adam",
              metrics=['accuracy', 'TopKCategoricalAccuracy', 'FalsePositives'])
model.fit(X_train, y_train, epochs=500, batch_size=200)
It returns this error:
ValueError: Input 0 of layer bidirectional_5 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 109]
If I uncomment the Dense layers under the LSTM and comment out the LSTM, the model works, so it's definitely related to the LSTM line.
How can I connect an LSTM layer to a Dense layer for multiclass classification?
Upvotes: 0
Views: 1156
Reputation: 36614
Try putting a TimeDistributed layer around the Dense layer. The root cause of the error is that LSTM expects 3D input of shape (batch, timesteps, features), but a plain Dense layer on your 2D input produces 2D output. If you make the input 3D and wrap the first Dense layer in TimeDistributed, the Dense layer is applied to each timestep and the output stays 3D, so the LSTM accepts it. Here's an example with bogus data:
from tensorflow import keras
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Bidirectional, LSTM, TimeDistributed

# Bogus 3D data: (samples, timesteps, features)
X_train = np.random.rand(100, 1, 10)
y_train = np.random.randint(0, 10, 100)
y_train = keras.utils.to_categorical(y_train)  # one-hot encode the labels
assert X_train.ndim == 3  # the LSTM needs 3D input

model = Sequential()
# TimeDistributed applies the Dense layer to every timestep,
# so the output remains 3D and can feed the LSTM.
model.add(TimeDistributed(Dense(10), input_shape=X_train.shape[1:]))
model.add(Bidirectional(LSTM(8)))
model.add(Dense(8, activation='tanh'))
model.add(Dense(8, activation='tanh'))
model.add(Dense(y_train.shape[-1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer="adam")
history = model.fit(X_train, y_train, epochs=1, batch_size=8)
Train on 100 samples
8/100 [=>............................] - ETA: 0s - loss: 2.2984
80/100 [=======================>......] - ETA: 0s - loss: 2.2863
100/100 [==============================] - 0s 950us/sample - loss: 2.2984
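For your own data, which the [None, 109] in the error message suggests is 2D, you can insert a dummy timestep axis so it matches the 3D shape the LSTM expects. A minimal sketch, assuming X_train is a NumPy array of shape (samples, 109); the name X_train_3d is just for illustration:
import numpy as np

# (samples, 109) -> (samples, 1, 109): one timestep per sample
X_train_3d = np.expand_dims(X_train, axis=1)

# X_train_3d can then feed a model like the one above, e.g.
# model.fit(X_train_3d, y_train, epochs=500, batch_size=200)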
Upvotes: 1