Reputation: 4547
I am working on a model based on this paper, and I am getting an exception because the GlobalMaxPooling1D layer does not support masking.
I have an Embedding layer with the mask_zero argument set to True. The exception is expected: the documentation of the Embedding layer states that all layers after an Embedding layer with mask_zero=True must support masking.
However, since I am processing sentences with a varying number of words, I do need the masking in the Embedding layer. My question is: how should I alter my model so that masking remains part of the model but does not cause a problem at the GlobalMaxPooling1D layer?
Below is the code for the model.
model = Sequential()
embedding_layer = Embedding(dictionary_size, num_word_dimensions,
                            weights=[embedding_weights], mask_zero=True,
                            embeddings_regularizer=regularizers.l2(0.0001))
model.add(TimeDistributed(embedding_layer,
                          input_shape=(max_conversation_length, timesteps)))
model.add(TimeDistributed(Bidirectional(LSTM(m // 2, return_sequences=True,
                                             kernel_regularizer=regularizers.l2(0.0001)))))
model.add(TimeDistributed(Dropout(0.2)))
model.add(TimeDistributed(GlobalMaxPooling1D()))
model.add(Bidirectional(LSTM(h // 2, return_sequences=True,
                             kernel_regularizer=regularizers.l2(0.0001)),
                        merge_mode='concat'))
model.add(Dropout(0.2))
crf = CRF(num_tags, sparse_target=False, kernel_regularizer=regularizers.l2(0.0001))
model.add(crf)
model.compile(optimizer, loss=crf.loss_function, metrics=[crf.accuracy])
Upvotes: 1
Views: 396
Reputation: 33460
However, as I am processing sentences with variable number of words in them, I do need the masking in the Embedding layer.
Are you padding the sentences so that they all have the same length? If so, then instead of masking, you can let the model learn on its own that 0 is padding and should be ignored, so you would not need explicit masking. This approach is also used for dealing with missing values in the data, as suggested in this answer.
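As a rough illustration of the padding step this relies on, here is a minimal NumPy sketch. The helper name pad_to_length and the word indices are hypothetical, not taken from the question; the only assumption is that index 0 is reserved for padding and never used for a real word.

```python
import numpy as np

def pad_to_length(sequences, max_len, pad_value=0):
    """Right-pad (or truncate) each list of word indices to max_len."""
    padded = np.full((len(sequences), max_len), pad_value, dtype=np.int64)
    for row, seq in enumerate(sequences):
        trimmed = seq[:max_len]          # drop anything beyond max_len
        padded[row, :len(trimmed)] = trimmed
    return padded

# Hypothetical sentences as word indices; 0 is reserved for padding.
sentences = [[4, 12, 7], [9, 3], [5, 8, 2, 6, 1]]
padded = pad_to_length(sentences, max_len=5)
print(padded)
```

With input padded like this, the Embedding layer could be created without mask_zero=True, so no mask reaches GlobalMaxPooling1D. Index 0 then simply maps to an ordinary embedding row, and the downstream layers have to learn to treat it as uninformative.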
Upvotes: 2