Reputation: 574
I am learning LSTM-based seq2seq models on the TensorFlow platform. I can train a model on simple seq2seq examples without any trouble.
However, when I have to learn two sequences at once from a given sequence (e.g., learning the previous sequence and the next sequence from the current sequence simultaneously), how can I do that? That is, how do I compute a combined error from both output sequences and backpropagate that same error to both of them?
Here's a snippet of the LSTM code I am using (mostly taken from the PTB example: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/models/rnn/ptb/ptb_word_lm.py#L132):
# Project the LSTM outputs to vocabulary-sized logits.
output = tf.reshape(tf.concat(1, outputs), [-1, size])
softmax_w = tf.get_variable("softmax_w", [size, word_vocab_size])
softmax_b = tf.get_variable("softmax_b", [word_vocab_size])
logits = tf.matmul(output, softmax_w) + softmax_b

# Per-example cross-entropy loss against the single target sequence.
loss = tf.nn.seq2seq.sequence_loss_by_example(
    [logits],
    [tf.reshape(self._targets, [-1])],
    [weights])
self._cost = cost = tf.reduce_sum(loss) / batch_size
self._final_state = state
self._lr = tf.Variable(0.0, trainable=False)

# Clip gradients and apply them with plain SGD.
tvars = tf.trainable_variables()
grads, _ = tf.clip_by_global_norm(tf.gradients(cost, tvars),
                                  config.max_grad_norm)
optimizer = tf.train.GradientDescentOptimizer(self.lr)
self._train_op = optimizer.apply_gradients(zip(grads, tvars))
Upvotes: 1
Views: 508
Reputation: 2196
It seems to me that you want a single encoder and multiple decoders (e.g., 2, for 2 output sequences), right? There is one2many_rnn_seq2seq in the seq2seq library for exactly this use case.
As for the loss, I think you can just add the losses from the two sequences (or weight them somehow, if you prefer), and then compute gradients and everything else as if the summed loss were the only loss; see the sketch below.
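Here is a minimal sketch of that added-loss idea in the same legacy TF API the question uses; logits_prev/logits_next, targets_prev/targets_next and weights_prev/weights_next are hypothetical tensors standing in for the outputs, targets, and weights of your two decoders:

import tensorflow as tf  # assumes the same old TF version with tf.nn.seq2seq

# One loss per decoder, computed exactly as for the single target sequence
# in the question (the *_prev / *_next tensors are hypothetical placeholders).
loss_prev = tf.nn.seq2seq.sequence_loss_by_example(
    [logits_prev],
    [tf.reshape(targets_prev, [-1])],
    [weights_prev])
loss_next = tf.nn.seq2seq.sequence_loss_by_example(
    [logits_next],
    [tf.reshape(targets_next, [-1])],
    [weights_next])

# Combined cost: just the sum (scale either term if you want to weight them).
cost = (tf.reduce_sum(loss_prev) + tf.reduce_sum(loss_next)) / batch_size

# Gradients of the combined cost flow back through both decoders and the
# shared encoder, the same way as with a single loss.
tvars = tf.trainable_variables()
grads, _ = tf.clip_by_global_norm(tf.gradients(cost, tvars),
                                  config.max_grad_norm)
train_op = tf.train.GradientDescentOptimizer(lr).apply_gradients(
    zip(grads, tvars))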
Upvotes: 0