Marek Justyna

Reputation: 224

Setting up an initial state in LSTM in Tensorflow 1.9

I am trying to build a simple LSTM network with 2 stacked layers, using MultiRNNCell for that purpose. I followed tutorials and other Stack Overflow topics, but I still cannot get my network to run. Below is the declaration of the initial state that I found on Stack Overflow.

cell_count = 10 # timesteps
num_hidden = 4 # hidden layer num of features
num_classes = 1 
num_layers = 2
state_size = 4

init_c = tf.Variable(tf.zeros([batch_size, cell_count]), trainable=False)
init_h = tf.Variable(tf.zeros([batch_size, cell_count]), trainable=False)
initial_state = rnn.LSTMStateTuple(init_c, init_h) #[num_layers, 2, batch_size, state_size])

Below is what my model looks like:

def generate_model_graph(self, data):

    L1 = self.generate_layer(self.cell_count)
    L2 = self.generate_layer(self.cell_count)

    #outputs from L1
    L1_outs, _ = L1(data, self.initial_state)

    #reverse output array
    L2_inputs = L1_outs[::-1]

    L2_outs, _ = L2(L2_inputs, self.initial_state)
    predicted_vals = tf.add(tf.matmul(self.weights["out"], L2_outs), self.biases["out"])
    L2_out = tf.nn.sigmoid(predicted_vals)
    return L2_out



def generate_layer(self, size):
    cells = [rnn.BasicLSTMCell(self.num_hidden) for _ in range(size)]
    return rnn.MultiRNNCell(cells)

And this is how I run the session:

def train_model(self, generator):
    tr, cost = self.define_model()

    init = tf.global_variables_initializer()
    with tf.Session() as sess:
        sess.run(init)
        for _ in range(self.n_epochs):
            batch_x, batch_y = self._prepare_data(generator)
            init_state = tf.zeros((self.cell_count, self.num_hidden))
            t, c = sess.run([tr, cost], feed_dict={self.X: batch_x, self.Y:batch_y, self.initial_state:init_state})
            print(c)

Unfortunately, I still get an error saying 'Variable' object is not iterable.

  File "detector_lstm_v2.py", line 104, in <module>
    c.train_model(data_gen)
  File "detector_lstm_v2.py", line 38, in train_model
    tr, cost = self.define_model()
  File "detector_lstm_v2.py", line 51, in define_model
    predicted_vals = self.generate_model_graph(self.X)
  File "detector_lstm_v2.py", line 65, in generate_model_graph
    L1_outs, _ = L1(data, self.initial_state)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/rnn_cell_impl.py", line 232, in __call__
    return super(RNNCell, self).__call__(inputs, state)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/layers/base.py", line 329, in __call__
    outputs = super(Layer, self).__call__(inputs, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 703, in __call__
    outputs = self.call(inputs, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/rnn_cell_impl.py", line 1325, in call
    cur_inp, new_state = cell(cur_inp, cur_state)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/rnn_cell_impl.py", line 339, in __call__
    *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/layers/base.py", line 329, in __call__
    outputs = super(Layer, self).__call__(inputs, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 703, in __call__
    outputs = self.call(inputs, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/rnn_cell_impl.py", line 633, in call
    c, h = state
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/variables.py", line 491, in __iter__
    raise TypeError("'Variable' object is not iterable.")
TypeError: 'Variable' object is not iterable.

Does anyone know how to solve this problem?

Upvotes: 0

Views: 2320

Answers (1)

Giuseppe Marra

Reputation: 1104

You are creating a multi-layer RNN cell, but you are passing it a single state.

Use this to create your state:

initial_state = L1.zero_state(batch_size, tf.float32)

or use it to initialize the variables, if you really need variables.
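For instance, a minimal sketch of both options, assuming a batch_size of 32 and the two-layer stack you intended (the names and the variable-backed variant are illustrative, not taken from your code):

import tensorflow as tf
from tensorflow.contrib import rnn

num_hidden = 4    # hidden units per layer
num_layers = 2    # two stacked layers, as intended
batch_size = 32   # assumed; use your real batch size

# same construction as generate_layer(), but num_layers deep
stacked_cell = rnn.MultiRNNCell(
    [rnn.BasicLSTMCell(num_hidden) for _ in range(num_layers)])

# zero_state returns one LSTMStateTuple (c, h) per layer,
# each of shape [batch_size, num_hidden]
initial_state = stacked_cell.zero_state(batch_size, tf.float32)

# if the state really has to live in variables (e.g. to carry it
# across session runs), back each layer's c and h with a
# non-trainable variable instead:
initial_state = tuple(
    rnn.LSTMStateTuple(
        tf.Variable(tf.zeros([batch_size, num_hidden]), trainable=False),
        tf.Variable(tf.zeros([batch_size, num_hidden]), trainable=False))
    for _ in range(num_layers))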

There are some "naming" problems in your code that make me think you are misunderstanding something here.

There are different parameters:

  1. The hidden size of your layers: it is the num_units argument of the BasicLSTMCell constructor. All the states of your cell need to have shape [batch_size, hidden_size] (and not [batch_size, cell_count]).
  2. cell_count in your code is not determining the length of the sequence but "how deep" your network is.
  3. The length of the sequence is automatically determined by the input sequence you pass to your model (which needs to be a list of tensors); see the sketch after this list.
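Putting these points together, a hedged sketch of how the pieces could be wired up (batch_size, num_input and the placeholder are assumptions, not taken from your code):

import tensorflow as tf
from tensorflow.contrib import rnn

num_hidden = 4    # hidden size of each layer (num_units of BasicLSTMCell)
num_layers = 2    # how deep the stack is (number of cells in MultiRNNCell)
cell_count = 10   # number of timesteps in the sequence
num_input = 1     # assumed feature size per timestep
batch_size = 32   # assumed batch size

# whole sequence: [batch_size, timesteps, features]
X = tf.placeholder(tf.float32, [batch_size, cell_count, num_input])

# a stack of num_layers cells, each with num_hidden units
stacked_cell = rnn.MultiRNNCell(
    [rnn.BasicLSTMCell(num_hidden) for _ in range(num_layers)])

# one LSTMStateTuple per layer, each of shape [batch_size, num_hidden]
initial_state = stacked_cell.zero_state(batch_size, tf.float32)

# static_rnn takes a list of cell_count tensors of shape [batch_size, num_input];
# the length of that list is what fixes the sequence length
inputs = tf.unstack(X, cell_count, axis=1)
outputs, final_state = tf.nn.static_rnn(
    stacked_cell, inputs, initial_state=initial_state)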

I recommend having a look at the TF tutorial on Recurrent Neural Networks here and maybe this answer here to understand what an RNNCell is w.r.t. the RNN literature (it is a layer and not a single cell).

Upvotes: 2
