betelgeuse

Reputation: 1266

Do we use different weights in Bidirectional LSTM for each batch?

For example, this is one of the functions we need to call for each batch. It looks like different parameters are created for each batch. Is that correct? If so, why? Shouldn't we be using the same parameters for the whole training set?

def bidirectional_lstm(input_data, num_layers=3, rnn_size=200, keep_prob=0.6):

    output = input_data
    for layer in range(num_layers):
        with tf.variable_scope('encoder_{}'.format(layer)):

            cell_fw = tf.contrib.rnn.LSTMCell(rnn_size, initializer=tf.random_uniform_initializer(-0.1, 0.1, seed=2))
            cell_fw = tf.contrib.rnn.DropoutWrapper(cell_fw, input_keep_prob=keep_prob)

            cell_bw = tf.contrib.rnn.LSTMCell(rnn_size, initializer=tf.random_uniform_initializer(-0.1, 0.1, seed=2))
            cell_bw = tf.contrib.rnn.DropoutWrapper(cell_bw, input_keep_prob=keep_prob)

            outputs, states = tf.nn.bidirectional_dynamic_rnn(cell_fw, 
                                                              cell_bw, 
                                                              output,
                                                              dtype=tf.float32)
            output = tf.concat(outputs, 2)

    return output

for batch_i, batch in enumerate(get_batches(X_train, batch_size)):
    embeddings = tf.nn.embedding_lookup(word_embedding_matrix, batch)
    output = bidirectional_lstm(embeddings)
    print(output.shape)

Upvotes: 2

Views: 353

Answers (1)

betelgeuse

Reputation: 1266

I have figured out the issue. It turns out that we do use the same parameters, and the code above raises an error on the second iteration saying that the bidirectional kernel already exists. To fix this, we need to pass reuse=tf.AUTO_REUSE when defining the variable scope. Therefore, the line

with tf.variable_scope('encoder_{}'.format(layer)):

will become

with tf.variable_scope('encoder_{}'.format(layer), reuse=tf.AUTO_REUSE):

Now the same weights are reused for every batch.
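
As a quick sanity check, here is a minimal sketch (assuming TF 1.x graph mode and the bidirectional_lstm function above with the reuse=tf.AUTO_REUSE fix applied) that verifies the second call adds no new trainable variables; the placeholder shapes are just illustrative:

import tensorflow as tf

# batch x time x embedding; the concrete sizes here are illustrative assumptions
x1 = tf.placeholder(tf.float32, [None, 10, 50])
_ = bidirectional_lstm(x1)
n_vars = len(tf.trainable_variables())   # variables created on the first call

x2 = tf.placeholder(tf.float32, [None, 10, 50])
_ = bidirectional_lstm(x2)
# the second call reuses the existing scopes, so no new variables appear
assert len(tf.trainable_variables()) == n_vars
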

Upvotes: 1
