LLB

Reputation: 91

For LSTM variables I get the warning "tensorflow:Gradients do not exist for variables"

Hi, I have a bidirectional LSTM layer:

from tensorflow.keras.layers import Layer, Bidirectional, LSTM

class BiDirLSTMInput(Layer):

  def __init__(self):
    super().__init__()
    self.bidir_lstm = Bidirectional(
                         LSTM(32, return_sequences=True, return_state=True)
                       )

  def call(self, input):
    # unpack the sequence output and the four returned states
    o, h1, h2, c1, c2 = self.bidir_lstm(input)
    return [h1, h2]

As you can see, I am only consuming the hidden states from the LSTM (and not the cell states).

Is that the reason I am getting the following warning?

WARNING:tensorflow:Gradients do not exist for variables (backward layer):

  1. lstm_cell_2/kernel:0
  2. lstm_cell_2/recurrent_kernel:0
  3. lstm_cell_2/bias:0

Ignoring this doesn't seem right. How do I deal with this warning?

Upvotes: 0

Views: 475

Answers (2)

LLB

Reputation: 91

Okay, I was finally able to resolve this warning.

It was a bit tricky to figure out what was wrong.

So basically, this is what was happening:

def call(self, input):
    o, h1, h2, c1, c2 = self.bidir_lstm(input)
    return (h1, h2)

As you can see, I was only consuming the hidden states and not the cell states.

That is why the gradient warning appears: outputs that never reach the loss produce no gradients for the weights that computed them.

The solution is one of the following:

  1. Consume only the output state, or

  2. If you want to consume the hidden states, then also consume the cell states.

    You can do this in multiple ways (see the sketch after the note below):

    a) h_and_c = concat(h, c)

    b) h_and_c_avg = avg(h, c)

    c) h_and_c_sum = sum(h, c)

Note: I have tested this with the TensorFlow Keras Bidirectional LSTM (I haven't checked with a plain LSTM).
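To make option 2 concrete, here is a minimal sketch against the layer from the question, assuming TensorFlow Keras (the class name BiDirLSTMFixed is just illustrative; the three combinations mirror a), b) and c) above, and which one you pick is a modeling choice):

import tensorflow as tf
from tensorflow.keras.layers import Layer, Bidirectional, LSTM

class BiDirLSTMFixed(Layer):

  def __init__(self):
    super().__init__()
    self.bidir_lstm = Bidirectional(
                         LSTM(32, return_sequences=True, return_state=True)
                       )

  def call(self, input):
    o, h1, h2, c1, c2 = self.bidir_lstm(input)
    # a) concatenate all returned states along the feature axis
    h_and_c = tf.concat([h1, h2, c1, c2], axis=-1)
    # b) ...or average them:
    # h_and_c = tf.reduce_mean(tf.stack([h1, h2, c1, c2]), axis=0)
    # c) ...or sum them:
    # h_and_c = tf.add_n([h1, h2, c1, c2])
    return h_and_c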

Upvotes: 1

Programmer

Reputation: 71

I think, yes. Could you try consuming the cell states as well and then observe whether you get the same warning or not?
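For example, a minimal sketch of that change to the call method from the question (assuming the same bidir_lstm layer):

def call(self, input):
    o, h1, h2, c1, c2 = self.bidir_lstm(input)
    # Return the cell states alongside the hidden states so that
    # all of the layer's outputs can contribute to the loss.
    return [h1, h2, c1, c2]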

Upvotes: 0
