user3139545

Reputation: 7374

Connecting an LSTM to a fully connected layer in TensorFlow

I'm trying to connect an LSTM layer to a fully connected layer in TensorFlow using the following code:

# lstm_outputs has shape 1x10x100
rnn_out = tf.reshape(lstm_outputs, [-1, 100])
# rnn_out has shape 10x100

Now I want to add a 1x10 vector to the output from the RNN and feed this new tensor into a fully connected layer.

extra_params = tf.placeholder(shape=[1,10], dtype=tf.float32)
fc_input = tf.concat(1,[rnn_out,extra_params])
fc1 = slim.fully_connected(fc_input,o_size,
    activation_fn=tf.nn.relu,
    weights_initializer=tf.truncated_normal_initializer(),
    biases_initializer=None)

However, the code fails on the tf.concat line with the following error:

TypeError: Expected int32, got list containing Tensors of type '_Message' instead.

I have two questions related to this code:

  1. What do I need to do to get the expected tensor to feed into the fully connected layer?
  2. What am I actually feeding into my fully connected layer? Is it a 1x1010 tensor or is it a 10x110 tensor?

Upvotes: 2

Views: 2792

Answers (1)

mrry

Reputation: 126154

Assuming that you're using TensorFlow 1.0, the TypeError is caused by the incorrect order of arguments to tf.concat() (which switched sometime around TensorFlow 0.12): it expects a list of tf.Tensor objects first, followed by the axis along which you want to concatenate those tensors.
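
Side by side, the two calling conventions look like this (the first is the pre-1.0 order from your snippet; the second is the TensorFlow 1.0 order):

# Pre-1.0 order: axis first, then the tensor list.
# Under TensorFlow 1.0 this raises the TypeError above.
tf.concat(1, [rnn_out, extra_params])
# TensorFlow 1.0 order: tensor list first, then the axis.
tf.concat([rnn_out, extra_params], 1)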

However, if you simply reverse the arguments (tf.concat([rnn_out, extra_params], 1)), you will get a shape-related error instead. tf.concat() requires that all of its inputs have the same size in every dimension except the axis along which you are concatenating. Here, rnn_out is a 10 x 100 matrix and extra_params is a 1 x 10 matrix, so the two tensors are not compatible for concatenation.
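
To make the shape requirement concrete, here is a minimal sketch with two standalone placeholders of the same shapes as rnn_out and extra_params; building the concat op already fails at graph-construction time:

a = tf.placeholder(shape=[10, 100], dtype=tf.float32)  # same shape as rnn_out
b = tf.placeholder(shape=[1, 10], dtype=tf.float32)    # same shape as extra_params
# Concatenating along axis 1 requires the remaining dimension (axis 0)
# to match, but 10 != 1, so this raises a shape-related ValueError.
c = tf.concat([a, b], 1)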

The correct solution depends on what the extra_params are supposed to represent. For example, if 10 is the batch size in your training, you might transpose extra_params into a 10 x 1 matrix. The following program should work:

rnn_out = tf.reshape(lstm_outputs, [-1, 100])
extra_params = tf.placeholder(shape=[10, 1], dtype=tf.float32)
fc_input = tf.concat([rnn_out, extra_params], 1)
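
With that change, fc_input is a 10 x 101 matrix (one extra feature column appended to each row of rnn_out), so, regarding your second question, the fully connected layer sees a 10 x 101 input rather than a 1 x 1010 or 10 x 110 one. A minimal end-to-end sketch, assuming a placeholder stands in for lstm_outputs and o_size is an arbitrary output size:

import tensorflow as tf
import tensorflow.contrib.slim as slim

o_size = 50  # assumed output size; substitute your own

# Stand-in for the real LSTM output, shaped 1 x 10 x 100.
lstm_outputs = tf.placeholder(shape=[1, 10, 100], dtype=tf.float32)

rnn_out = tf.reshape(lstm_outputs, [-1, 100])           # shape: 10 x 100
extra_params = tf.placeholder(shape=[10, 1], dtype=tf.float32)
fc_input = tf.concat([rnn_out, extra_params], 1)        # shape: 10 x 101

fc1 = slim.fully_connected(fc_input, o_size,
    activation_fn=tf.nn.relu,
    weights_initializer=tf.truncated_normal_initializer(),
    biases_initializer=None)                            # shape: 10 x o_size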

Upvotes: 3
