How to modify the Tensorflow Sequence2Sequence model to implement Bidirectional LSTM rather than Unidirectional one?

Question

Refer to this post to know the background of the problem: Does the TensorFlow embedding_attention_seq2seq method implement a bidirectional RNN Encoder by default?

I am working on the same model, and want to replace the unidirectional LSTM layer with a Bidirectional layer. I realize I have to use static_bidirectional_rnn instead of static_rnn, but I am getting an error due to some mismatch in the tensor shape.

I replaced the following line:

encoder_outputs, encoder_state = core_rnn.static_rnn(encoder_cell, encoder_inputs, dtype=dtype)

with the line below:

encoder_outputs, encoder_state_fw, encoder_state_bw = core_rnn.static_bidirectional_rnn(encoder_cell, encoder_cell, encoder_inputs, dtype=dtype)

That gives me the following error:

InvalidArgumentError (see above for traceback): Incompatible shapes: [32,5,1,256] vs. [16,1,1,256] [[Node: gradients/model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/Attention_0/add_grad/BroadcastGradientArgs = BroadcastGradientArgs[T=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](gradients/model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/Attention_0/add_grad/Shape, gradients/model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/Attention_0/add_grad/Shape_1)]]

I understand that the outputs of both the methods are different, but I do not know how to modify attention code to incorporate that. How do I send both the forward and backward states to the attention module- do I concatenate both the hidden states?

How to modify the Tensorflow Sequence2Sequence model to implement Bidirectional LSTM rather than Unidirectional one?

Answers (1)

Related Questions