Reputation: 9
I want to do sentiment analysis using BERT embeddings and an LSTM layer. This is my code:
i = tf.keras.layers.Input(shape=(), dtype=tf.string, name='text')
x = bert_preprocess(i)
x = bert_encoder(x)
x = tf.keras.layers.Dropout(0.2, name="dropout")(x['pooled_output'])
x = tf.keras.layers.LSTM(128, dropout=0.2)(x)
x = tf.keras.layers.Dense(128, activation='relu')(x)
x = tf.keras.layers.Dense(1, activation='sigmoid', name="output")(x)
model = tf.keras.Model(i, x)
When building this model I got the following error:
ValueError: Input 0 of layer "lstm_2" is incompatible with the layer: expected
ndim=3, found ndim=2. Full shape received: (None, 768)
Is the logic of my code correct? Can anyone please correct my code?
Upvotes: 0
Views: 4184
Reputation: 373
BERT-like models generally give you three kinds of outputs (taken from Hugging Face's TFBertModel documentation):
last_hidden_state with shape (batch_size, sequence_length, hidden_size)
pooler_output with shape (batch_size, hidden_size)
hidden_states, a tuple of tensors, each with shape (batch_size, sequence_length, hidden_size)
Here hidden_size is 768.
As the error says, the LSTM layer received a 2-dimensional tensor where it expects 3 dimensions. That tensor is the output of the dropout layer, which has the same shape as the pooled output of the bert_encoder layer, because dropout layers do not change tensor shape.
x = bert_encoder(x)
x = tf.keras.layers.Dropout(0.2, name="dropout")(x['pooled_output'])
x = tf.keras.layers.LSTM(128, dropout=0.2)(x)
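The shape problem can be reproduced without loading BERT at all. The sketch below is only an illustration: it uses a tensor of zeros with shape (4, 768) as a stand-in for the pooled output (the batch size of 4 is an arbitrary assumption), checks that dropout leaves the shape alone, and shows that the LSTM refuses the 2-dimensional input.

```python
import tensorflow as tf

# Stand-in for the encoder's pooled output: (batch_size, hidden_size).
# The batch size of 4 is arbitrary; 768 matches the error message.
pooled = tf.zeros((4, 768))

# Dropout does not change the tensor shape, so this is still (4, 768).
dropped = tf.keras.layers.Dropout(0.2)(pooled)
print(dropped.shape)

# The LSTM expects (batch_size, num_timesteps, num_features) and rejects 2-D input.
lstm_error = None
try:
    tf.keras.layers.LSTM(128)(dropped)
except ValueError as err:
    lstm_error = err
    print("LSTM rejected the input:", err)
```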
So if you plan to use an LSTM layer after the bert_encoder layer, the LSTM needs a three-dimensional input of the form (batch_size, num_timesteps, num_features). You would therefore have to use either the hidden_states or the last_hidden_state output instead of pooler_output. If bert_encoder is a TensorFlow Hub BERT encoder, as the pooled_output key in your code suggests, the per-token output is returned under the sequence_output key.
You will have to choose between the two depending on your objective/use case.
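As a rough sketch of the fix, the layers after the encoder can be built and checked on their own against a 3-dimensional input. Everything here is an assumption for illustration: the Input layer named token_embeddings stands in for the encoder's per-token output, SEQ_LEN = 16 is an arbitrary sequence length, and build_lstm_head is a hypothetical helper, not part of any library.

```python
import tensorflow as tf

SEQ_LEN = 16       # assumed number of tokens per example (illustrative only)
HIDDEN_SIZE = 768  # hidden size reported in the error message

def build_lstm_head(seq_len=SEQ_LEN, hidden_size=HIDDEN_SIZE):
    """Hypothetical helper: the question's layers applied to a per-token input."""
    # Stand-in for the encoder's per-token output:
    # (batch_size, sequence_length, hidden_size)
    tokens = tf.keras.layers.Input(shape=(seq_len, hidden_size), name="token_embeddings")
    x = tf.keras.layers.Dropout(0.2, name="dropout")(tokens)
    # The LSTM now receives the 3-D tensor it expects and returns (batch_size, 128).
    x = tf.keras.layers.LSTM(128, dropout=0.2)(x)
    x = tf.keras.layers.Dense(128, activation="relu")(x)
    x = tf.keras.layers.Dense(1, activation="sigmoid", name="output")(x)
    return tf.keras.Model(tokens, x)

model = build_lstm_head()

# Fake batch of 2 examples in place of real BERT output.
fake_batch = tf.zeros((2, SEQ_LEN, HIDDEN_SIZE))
probs = model(fake_batch)
print(probs.shape)
```

In the model from the question, the analogous change would be to feed the encoder's per-token output into the dropout layer instead of x['pooled_output'], keeping the remaining layers as they are.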
Upvotes: 1