Reputation: 91
Hi, I have a bidirectional LSTM layer:
    from tensorflow.keras.layers import Layer, Bidirectional, LSTM

    class BiDirLSTMInput(Layer):
        def __init__(self):
            super().__init__()
            self.bidir_lstm = Bidirectional(
                LSTM(32, return_sequences=True, return_state=True)
            )

        def call(self, input):
            # Unpack the sequence output and the four state tensors
            o, h1, h2, c1, c2 = self.bidir_lstm(input)
            return [h1, h2]
As you can see, I am only consuming the hidden states from the LSTM (and not the cell states).
Is that the reason I am getting the following warning?
WARNING:tensorflow:Gradients do not exist for variables for (backward layer):
Ignoring it doesn't seem right. How do I deal with this warning?
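For reference, here is a minimal sketch of how the layer could be wired up to reproduce the warning; the input shape, the Dense head, and the random training data are placeholders, not my actual model:

    import tensorflow as tf
    from tensorflow.keras.layers import Input, Dense, Concatenate
    from tensorflow.keras.models import Model

    # Placeholder shapes: sequences of 10 timesteps with 8 features each
    inp = Input(shape=(10, 8))
    h1, h2 = BiDirLSTMInput()(inp)           # only the hidden states come back
    out = Dense(1)(Concatenate()([h1, h2]))
    model = Model(inp, out)
    model.compile(optimizer="adam", loss="mse")

    x = tf.random.normal((4, 10, 8))
    y = tf.random.normal((4, 1))
    model.fit(x, y, epochs=1)                # the gradient warning appears during fit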
Upvotes: 0
Views: 475
Reputation: 91
Okay, I was finally able to resolve this warning.
It was a bit tricky to find out what was wrong.
Basically, what was happening was this:
    def call(self, input):
        o, h1, h2, c1, c2 = self.bidir_lstm(input)
        return (h1, h2)
As you can see, I was only consuming the hidden states and not the cell states.
That is the reason you see the gradient warning.
The solution is one of the following:
Either consume only the sequence output (o),
or, if you want to consume the hidden states, then also consume the cell states.
Combining the hidden and cell states can be done in multiple ways, e.g. (see the sketch below):
a) h_and_c = concat(h, c)
b) h_and_c_avg = avg(h, c)
c) h_and_c_sum = sum(h, c)
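Here is a minimal sketch of option (a) applied to the layer from the question; the use of tf.concat and the exact way the four state tensors are combined are my choices, and the other two variants work the same way:

    import tensorflow as tf
    from tensorflow.keras.layers import Layer, Bidirectional, LSTM

    class BiDirLSTMInput(Layer):
        def __init__(self):
            super().__init__()
            self.bidir_lstm = Bidirectional(
                LSTM(32, return_sequences=True, return_state=True)
            )

        def call(self, input):
            o, h1, h2, c1, c2 = self.bidir_lstm(input)
            # Option (a): concatenate hidden and cell states along the
            # feature axis, so every state tensor is consumed downstream
            h_and_c = tf.concat([h1, h2, c1, c2], axis=-1)
            return h_and_c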
Note: I have tested this with the TensorFlow Keras Bidirectional LSTM (I haven't checked it with a plain LSTM).
Upvotes: 1
Reputation: 71
I think yes. Could you try consuming the cell states
as well and then check whether you still get the same warning?
Upvotes: 0