Eudie

Reputation: 692

TensorFlow: tf.nn.dynamic_rnn(): unable to gather output values at the last time step when the time dimension is dynamic

I am trying to classify news articles using an RNN. Since the length of news articles is not fixed, I am using tf.nn.dynamic_rnn().

# ....{Some Code Above}.....
with graph.as_default():
    sentences = tf.placeholder(tf.float32, shape=(batch_size, None, emmbedding_dimension))
    sequence_length = tf.placeholder(tf.float32, shape=batch_size)
    labels = tf.placeholder(tf.float32, shape=(batch_size, 2))

    lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(num_units=lstm_size)
    stacked_lstm = tf.nn.rnn_cell.DropoutWrapper(lstm_cell, output_keep_prob=1)
    stacked_lstm = tf.nn.rnn_cell.MultiRNNCell([stacked_lstm] * no_of_lstm_layers)
    outputs, states = tf.nn.dynamic_rnn(cell=stacked_lstm,
                                        inputs=sentences,
                                        sequence_length=sequence_length,
                                        initial_state=stacked_lstm.zero_state(batch_size, tf.float32))
# ....{Some Code Below}.....

The tensor shape of 'outputs' from the code above is (batch_size, ?, lstm_size).

I want to gather the output at the end of each sentence, whose length is dynamic. I am using the following code for that:

outputs = tf.transpose(outputs, [1, 0, 2])
last = tf.gather(outputs, int(outputs.get_shape()[0]) - 1)

I am getting the following error:

Traceback (most recent call last):
  File "./rnn_fitness_level1_0.py", line 127, in <module>
    last = tf.gather(outputs, int(outputs.get_shape()[0]) - 1)
TypeError: __int__ returned non-int (type NoneType)

I believe this error is caused by the dynamic shape of outputs along the time dimension (sentence length).

In other words, the result of outputs.get_shape()[0] is "?" (None).

The technique above for getting the last output works when the time dimension (sentence length) is fixed.

Is there a way to achieve this when the time dimension (sentence length) is dynamic?

As of now, I am doing the following:

last = tf.reduce_mean(outputs, [0])

But my understanding is that by averaging across the time dimension (sentence length), I am not exploiting the RNN's ability to capture sequential patterns. Please let me know what you think.

Upvotes: 0

Views: 966

Answers (1)

Peter Hawkins

Reputation: 3211

In general, get_shape() is best-effort. TensorFlow does not always know the shape of a Tensor before the graph runs.

There are a number of things you could try. One is to compute the offset of the last index yourself in Python without using get_shape; if you know the sizes of the inputs, this should not be difficult.
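As a rough sketch of the index arithmetic behind this first suggestion, here is the idea with NumPy standing in for the TensorFlow ops. The toy shapes, the lengths array, and the helper name last_relevant are made up for illustration and are not from the question. In a graph, the same steps would presumably use tf.shape (the run-time shape) instead of get_shape, together with tf.range, tf.reshape and tf.gather.

```python
import numpy as np

def last_relevant(outputs, lengths):
    """Pick outputs[i, lengths[i] - 1, :] for every sequence i in the batch.

    outputs: array of shape (batch, max_time, size), batch-major
    lengths: integer array of shape (batch,), the true length of each sequence
    (both names are hypothetical, chosen for this sketch)
    """
    batch, max_time, size = outputs.shape
    # Collapse batch and time into a single axis so one flat index selects a row.
    flat = outputs.reshape(batch * max_time, size)
    # Row i starts at i * max_time; its last valid step is lengths[i] - 1 further on.
    index = np.arange(batch) * max_time + (lengths - 1)
    return flat[index]

# Toy data: 2 sequences, 3 time steps, 2 features.
outputs = np.arange(2 * 3 * 2, dtype=np.float32).reshape(2, 3, 2)
lengths = np.array([2, 3])
last = last_relevant(outputs, lengths)  # shape (2, 2)
```

Because the offset is derived from the per-sequence lengths, each row gets its own last valid step rather than one shared position.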

Another option would be to use TensorFlow's slicing support, which accepts a NumPy-style "-1" index to represent the last element. For example, if x is a 3D Tensor, x[:, -1, :] should slice out the last element of the middle dimension.
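To illustrate what the "-1" slice selects, here is a small NumPy sketch; the array values are invented for the example. One thing to keep in mind: the slice takes the same final position for every row, so for a padded batch a shorter sequence yields its padding step. As far as I know, dynamic_rnn zeroes outputs past each element's sequence_length, so that step would be all zeros.

```python
import numpy as np

# Hypothetical padded batch: 2 sequences, 3 time steps, 2 features.
# The first sequence has true length 2, so its third step is zero padding.
x = np.array([[[1., 1.], [2., 2.], [0., 0.]],
              [[3., 3.], [4., 4.], [5., 5.]]])

# NumPy-style "-1" picks the final time step of the middle dimension for every row.
last_step = x[:, -1, :]  # shape (2, 2)
```

Here the second row gives its true last output, while the first row gives the padding step, so this approach is a good fit mainly when every sequence in the batch fills the whole time dimension.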

For more documentation, see the tf.Tensor.__getitem__ documentation here: https://www.tensorflow.org/api_docs/python/framework/core_graph_data_structures#Tensor

Hope that helps!

Upvotes: 1
