Marios Mourelatos

Reputation: 403

Confused about how to run a TensorFlow LSTM

I have seen two different ways of calling an LSTM in TensorFlow, and I am confused about how the two methods differ and in which situations to use one or the other.

The first is to create an LSTM cell and then call it immediately, as in the code below:

lstm = rnn_cell.BasicLSTMCell(lstm_size)
# Initial state of the LSTM memory.
initial_state = state = tf.zeros([batch_size, lstm.state_size])

for i in range(num_steps):
    # The value of state is updated after processing each batch of words.
    output, state = lstm(words[:, i], state)

The second is to call the LSTM cell through rnn.rnn(), as below:

# Define an LSTM cell with TensorFlow
lstm = rnn_cell.BasicLSTMCell(n_hidden, forget_bias=1.0)

# Apply the hidden-layer projection before splitting, because tf.matmul expects a single tensor, not a list.
# Assumes inputToLstm has already been reshaped to (n_steps * batch_size, n_input).
inputToLstmProjected = tf.matmul(inputToLstm, weights['hidden']) + biases['hidden']

# Split the data because rnn.rnn() needs a list of inputs for the RNN inner loop
inputToLstmSplited = tf.split(0, n_steps, inputToLstmProjected) # n_steps * (batch_size, n_hidden)

# Get the LSTM cell outputs
outputs, states = rnn.rnn(lstm, inputToLstmSplited, initial_state=istate)

Upvotes: 1

Views: 725

Answers (1)

Avishkar Bhoopchand

Reputation: 929

The second effectively does the same thing as the loop in the first: it returns a list of all the outputs collected in the loop, plus the final state. It is a bit more efficient, though, and it performs a number of safety checks. It also supports useful features such as variable sequence lengths. The first option is presented in the TensorFlow tutorials to give you an idea of how an RNN is unrolled, but the second option is preferred for "production" code.
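To make the equivalence concrete, here is a minimal pure-Python sketch of the two styles. It is not TensorFlow code: `toy_cell`, `manual_unroll`, and `run_rnn` are hypothetical names, the tanh recurrence is only a stand-in for a real LSTM cell, and the choice to emit zeros and carry the state forward past `sequence_length` is an assumption about how such a helper might treat shorter sequences.

```python
import math


def toy_cell(x, state, w_x=0.5, w_h=0.8):
    """Hypothetical stand-in for an LSTM cell: returns (output, new_state)."""
    new_state = math.tanh(w_x * x + w_h * state)
    return new_state, new_state


def manual_unroll(cell, inputs, initial_state):
    """First style: call the cell once per time step in your own loop."""
    state = initial_state
    output = None
    for x in inputs:
        # The state is updated after every step; only the latest output is kept.
        output, state = cell(x, state)
    return output, state


def run_rnn(cell, inputs, initial_state, sequence_length=None):
    """Second style: a helper that owns the loop and collects every output."""
    if not inputs:
        raise ValueError("inputs must contain at least one time step")
    if sequence_length is None:
        sequence_length = len(inputs)
    state = initial_state
    outputs = []
    for t, x in enumerate(inputs):
        if t < sequence_length:
            output, state = cell(x, state)
        else:
            # Assumed behaviour past the end of the sequence:
            # emit a zero output and leave the state unchanged.
            output = 0.0
        outputs.append(output)
    return outputs, state


inputs = [1.0, -0.5, 0.25, 2.0]
last_output, final_state_loop = manual_unroll(toy_cell, inputs, 0.0)
outputs, final_state_helper = run_rnn(toy_cell, inputs, 0.0)
```

Both calls walk the same four time steps, so `outputs[-1]` matches `last_output` and the two final states agree. The helper additionally keeps every intermediate output and validates its input, which is the kind of bookkeeping you would otherwise have to write yourself in the manual loop.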

Upvotes: 2
