Reputation: 21
I am trying to perform a sentiment classification task using an attention-based architecture that combines convolutional and BiLSTM layers. The first layer of my model is an Embedding layer, followed by a Convolution1D layer. I have set mask_zero=True
on the Embedding layer since I have padded the sequences with zeros. This, however, raises an error at the Convolution1D layer, because that layer does not support masking. I still need to mask the zero inputs, since there are LSTM layers after the convolutional layers. Does anyone have a solution for this? I have attached a sample of my model code up to the Convolution1D layer for reference.
from keras.layers import Input, Embedding, Convolution1D
from keras.regularizers import l2

wordsInputs = Input(shape=(maxSeq,), name='words_input')
embed_reg = l2(l=0.001)
# mask_zero=True so downstream layers skip the zero-padded timesteps
emb = Embedding(vocabSize, 300, mask_zero=True, init='glorot_uniform', W_regularizer=embed_reg)(wordsInputs)
# Convolution1D does not support masking -- this is where the error is raised
convOutput = Convolution1D(nb_filter=100, filter_length=3, activation='relu', border_mode='same')(emb)
Upvotes: 0
Views: 282
Reputation: 3763
It looks like you have defined a maxSeq length and you say you are padding the sequences with zeros. mask_zero means something slightly different: it reserves the input word index zero for the internals of the framework, which uses it to mark the end of a variable-length sequence, so you are not supposed to use zero as a real word index.
I think the solution is simply to remove the mask_zero=True parameter, as it is unneeded here (it is meant for variable-length sequences), and to use zero as your padding word index.
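A minimal sketch of what I mean, assuming the same maxSeq, vocabSize, and Keras 1 layer names as in your snippet (the only change is that the mask_zero argument is dropped, and zero stays as the padding index):

from keras.layers import Input, Embedding, Convolution1D
from keras.regularizers import l2

wordsInputs = Input(shape=(maxSeq,), name='words_input')
# No mask_zero: index 0 is simply the padding word, embedded like any other index
emb = Embedding(vocabSize, 300, init='glorot_uniform', W_regularizer=l2(l=0.001))(wordsInputs)
# Convolution1D now accepts the unmasked embedding output without error
convOutput = Convolution1D(nb_filter=100, filter_length=3, activation='relu', border_mode='same')(emb)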
Upvotes: 1