Reputation: 87
I want to build a model similar to this architecture:
My current LSTM model is as follows:
x = Embedding(max_features, embed_size, weights=[embedding_matrix],trainable=False)(inp)
x = SpatialDropout1D(0.1)(x)
x = Bidirectional(CuDNNLSTM(128, return_sequences=True))(x)
x = Bidirectional(CuDNNLSTM(64, return_sequences=True))(x)
avg_pool = GlobalAveragePooling1D()(x)
max_pool = GlobalMaxPooling1D()(x)
conc = concatenate([avg_pool, max_pool])
conc = Dense(64, activation="relu")(conc)
conc = Dropout(0.1)(conc)
outp = Dense(1, activation="sigmoid")(conc)
model = Model(inputs=inp, outputs=outp)
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=[f1])
How can I use a Conv2D layer followed by a 2D max pooling layer after the BiLSTM layer?
Upvotes: 1
Views: 1565
Reputation: 2016
There are a few important points you need to pay attention to in order to create this (fairly complicated) model.
Here is the model itself, created using the functional API:
from keras import backend as K
from keras.layers import (Input, Bidirectional, LSTM, Lambda, Conv2D,
                          MaxPooling2D, Dense)
from keras.models import Model

# expand_dims must live in its own function so it can be wrapped in a Lambda
def expand_dims(x):
    return K.expand_dims(x, -1)

inp = Input(shape=(3, 3))
lstm = Bidirectional(LSTM(128, return_sequences=True))(inp)
lstm = Lambda(expand_dims)(lstm)  # add a channel axis so Conv2D gets a 4D tensor
conv2d = Conv2D(filters=128, kernel_size=2, padding='same')(lstm)
max_pool = MaxPooling2D(pool_size=(2, 2))(conv2d)
predictions = Dense(10, activation='softmax')(max_pool)
model = Model(inputs=inp, outputs=predictions)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
First, create your input shape. From the image above it looks like you work with 7 samples, a window of 3, and 3 features, i.e. a tensor of shape (7, 3, 3). Obviously you can change this to whatever you like. Feed the input layer into your bidirectional LSTM layer:
inp = Input(shape=(3,3))
lstm = Bidirectional(LSTM(128, return_sequences=True))(inp)
Second, as @Amir mentioned, you need to expand the dimensions if you want to use a Conv2D
layer. However, using the Keras backend alone is not sufficient, because a model created with the functional API requires every operation in it to be a Keras layer. See this answer for the error 'NoneType' object has no attribute '_inbound_nodes'
. Therefore, you need to extract expand_dims
into its own function and wrap it in a Lambda
layer:
def expand_dims(x):
return K.expand_dims(x, -1)
lstm = Lambda(expand_dims)(lstm)
The rest is pretty straightforward once the above is sorted:
conv2d = Conv2D(filters=128, kernel_size=2, padding='same')(lstm)
max_pool = MaxPooling2D(pool_size=(2, 2))(conv2d)
predictions = Dense(10, activation='softmax')(max_pool)
model = Model(inputs=inp, outputs=predictions)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
The summary of the model looks like:
Layer (type) Output Shape Param #
=================================================================
input_67 (InputLayer) (None, 3, 3) 0
_________________________________________________________________
bidirectional_29 (Bidirectio (None, 3, 256) 135168
_________________________________________________________________
lambda_7 (Lambda) (None, 3, 256, 1) 0
_________________________________________________________________
conv2d_19 (Conv2D) (None, 3, 256, 128) 640
_________________________________________________________________
max_pooling2d_14 (MaxPooling (None, 1, 128, 128) 0
_________________________________________________________________
dense_207 (Dense) (None, 1, 128, 10) 1290
=================================================================
Total params: 137,098
Trainable params: 137,098
Non-trainable params: 0
_________________________________________________________________
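The steps above can be put together into a minimal end-to-end check. This sketch is written against tf.keras (an assumption; the original used standalone Keras) and verifies that the Lambda-wrapped expand_dims produces the 4D tensor Conv2D needs:

```python
import numpy as np
from tensorflow.keras import backend as K
from tensorflow.keras.layers import (Input, Bidirectional, LSTM, Lambda,
                                     Conv2D, MaxPooling2D, Dense)
from tensorflow.keras.models import Model

def expand_dims(x):
    return K.expand_dims(x, -1)

inp = Input(shape=(3, 3))
lstm = Bidirectional(LSTM(128, return_sequences=True))(inp)   # (None, 3, 256)
lstm = Lambda(expand_dims)(lstm)                              # (None, 3, 256, 1)
conv2d = Conv2D(filters=128, kernel_size=2, padding='same')(lstm)
max_pool = MaxPooling2D(pool_size=(2, 2))(conv2d)             # (None, 1, 128, 128)
predictions = Dense(10, activation='softmax')(max_pool)
model = Model(inputs=inp, outputs=predictions)

# 7 samples, window of 3, 3 features, as in the description above
out = model.predict(np.random.rand(7, 3, 3), verbose=0)
print(out.shape)  # (7, 1, 128, 10)
```

Note that, as the summary shows, Dense is applied position-wise to the 4D tensor; if you want a single 10-way prediction per sample, you would flatten or pool before the final Dense.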
And here is the visualisation:
Upvotes: 1
Reputation: 16587
Conv2D needs a 4D tensor with shape (batch, rows, cols, channels)
. In NLP problems, unlike computer vision, we do not have a channel dimension. What can be done?
We can add an extra dimension to our tensors with the expand_dims
function to act as a channel. For example, if our tensor has a shape of (batch, seq, dim)
then, after expansion, it becomes (batch, seq, dim, 1)
.
lstm = Bidirectional(LSTM(128, return_sequences=True))(embed)
# wrap the backend call in a Lambda so it is a proper Keras layer
lstm = Lambda(lambda t: K.expand_dims(t, axis=-1))(lstm)
conv2d = Conv2D(filters=128, kernel_size=2, padding='same')(lstm)
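The shape change can be checked in isolation. This is a quick sketch using the tf.keras backend (an assumption; the original may have used standalone Keras):

```python
import numpy as np
from tensorflow.keras import backend as K

# a dummy (batch, seq, dim) tensor, e.g. a BiLSTM output
t = K.constant(np.zeros((4, 5, 8)))

# add a trailing axis of size 1 to act as the "channel" dimension
expanded = K.expand_dims(t, axis=-1)
print(expanded.shape)  # (4, 5, 8, 1)
```

Remember that inside a functional-API model this backend call must be wrapped in a Lambda layer, as the other answer explains.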
Upvotes: 1