betelgeuse

Reputation: 1266

Do we use different weights in Bidirectional LSTM for each batch?

For example, this is one of the functions we need to call for each batch. It looks like different parameters are created for each batch. Is that correct? If so, why? Shouldn't we be using the same parameters for the whole training set?

def bidirectional_lstm(input_data, num_layers=3, rnn_size=200, keep_prob=0.6):

    output = input_data
    for layer in range(num_layers):
        with tf.variable_scope('encoder_{}'.format(layer)):

            cell_fw = tf.contrib.rnn.LSTMCell(rnn_size, initializer=tf.random_uniform_initializer(-0.1, 0.1, seed=2))
            cell_fw = tf.contrib.rnn.DropoutWrapper(cell_fw, input_keep_prob=keep_prob)

            cell_bw = tf.contrib.rnn.LSTMCell(rnn_size, initializer=tf.random_uniform_initializer(-0.1, 0.1, seed=2))
            cell_bw = tf.contrib.rnn.DropoutWrapper(cell_bw, input_keep_prob=keep_prob)

            outputs, states = tf.nn.bidirectional_dynamic_rnn(cell_fw, 
                                                              cell_bw, 
                                                              output,
                                                              dtype=tf.float32)
            output = tf.concat(outputs, 2)

    return output

for batch_i, batch in enumerate(get_batches(X_train, batch_size)):
    embeddings = tf.nn.embedding_lookup(word_embedding_matrix, batch)
    output = bidirectional_lstm(embeddings)
    print(output.shape)

Upvotes: 2

Views: 353

Answers (1)

betelgeuse

Reputation: 1266

I have figured out the issue. It turns out that we do use the same parameters, and the code above raises an error on the second iteration saying that the bidirectional kernel already exists. To fix this, we need to pass reuse=tf.AUTO_REUSE when defining the variable scope. Therefore, the line

with tf.variable_scope('encoder_{}'.format(layer)):

will become

with tf.variable_scope('encoder_{}'.format(layer), reuse=tf.AUTO_REUSE):

Now the same weights are reused for every batch.
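
As a quick sanity check, here is a minimal sketch (assuming TF 1.x graph mode and the bidirectional_lstm function above with the reuse=tf.AUTO_REUSE fix applied) that verifies the second call adds no new trainable variables; the placeholder shapes are just illustrative:

import tensorflow as tf

# batch x time x embedding; the concrete sizes here are illustrative assumptions
x1 = tf.placeholder(tf.float32, [None, 10, 50])
_ = bidirectional_lstm(x1)
n_vars = len(tf.trainable_variables())   # variables created on the first call

x2 = tf.placeholder(tf.float32, [None, 10, 50])
_ = bidirectional_lstm(x2)
# the second call reuses the existing scopes, so no new variables appear
assert len(tf.trainable_variables()) == n_vars
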

Upvotes: 1
