Reputation: 13
I want to make weight tensors for tf.nn.seq2seq.sequence_loss_by_example. I'm using an RNN-LSTM with a maximum of 100 steps, and each batch item is zero-padded to the maximum number of steps (100).
The shapes of my logits and labels look like this:
Tensor("dropout/mul_1:0", shape=(50000, 168), dtype=float32) # logits
Tensor("ArgMax:0", shape=(500, 100), dtype=int64) # labels
50000 is 500 (batch_size) * 100 (num_steps), and 168 is the number of classes. I'm passing them to sequence_loss_by_example following the ptb_word_lm.py example provided by TensorFlow: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/models/rnn/ptb/ptb_word_lm.py
loss = tf.nn.seq2seq.sequence_loss_by_example(
    [logits],
    [tf.reshape(labels, [-1])],
    [tf.ones([cf.batch_size * cf.max_time_steps], dtype=tf.float32)])
However, because my logits and labels are zero-padded, the losses come out wrong. Based on this answer, https://stackoverflow.com/a/38502547/3974129, I tried to change the tf.ones([..]) part to a weight tensor, but the conditions in that answer are too different from mine.
I have the step-length information shown below, and I feed it when training.
self._x_len = tf.placeholder(tf.int64, shape=[self._batch_size])
For example, I feed the length information [3, 10, 2, 3, 1] for a batch of size 5. These lengths are also used as sequence_length in tf.nn.rnn().
One way I can think of is to iterate over x_len and use each item as the index of the last 1 in the corresponding weight vector:
[0 0 0 0 0 ... 0 0 0] => [1 1 1 ... 1 0 0 0 0]
(each weight tensor has size 100, the maximum number of time steps)
But as you know, I cannot use the values inside a tensor as indices, because at graph-construction time they have not been fed yet.
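To make the dead end concrete, here is a hypothetical, non-working sketch (weight is an imaginary per-row buffer, not real code):

# Does NOT work: x_len is a placeholder, so it carries no values at
# graph-construction time and cannot be iterated like a Python list.
for length in self._x_len:      # raises TypeError: Tensor objects are not iterable
    weight[:length] = 1.0       # tensors also don't support item assignment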
How can I make weight tensors like this?
Upvotes: 0
Views: 2441
Reputation: 13
Using Tensorflow: creating mask of varied lengths and https://github.com/aymericdamien/TensorFlow-Examples/blob/master/examples/3_NeuralNetworks/dynamic_rnn.py, the problem is solved. I can build step indices up to the maximum number of steps and mask them against the sequence lengths.
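A minimal sketch of that masking, assuming the batch_size, max_time_steps, and x_len placeholder from the question (the names steps, lengths, mask, and weights are mine):

import tensorflow as tf

batch_size = 5
max_time_steps = 100

x_len = tf.placeholder(tf.int64, shape=[batch_size])  # fed with e.g. [3, 10, 2, 3, 1]

# Row vector of step indices [0, 1, ..., max_time_steps - 1], shape [1, max_time_steps].
steps = tf.expand_dims(tf.range(max_time_steps), 0)

# Column vector of lengths, shape [batch_size, 1], cast to match tf.range's int32.
lengths = tf.expand_dims(tf.cast(x_len, tf.int32), 1)

# Broadcasting compares every step index with every length:
# mask[i, t] is 1.0 exactly when t < x_len[i], and 0.0 in the padded region.
mask = tf.cast(tf.less(steps, lengths), tf.float32)   # shape [batch_size, max_time_steps]

# Flatten to line up with the [batch_size * max_time_steps] logits and labels.
weights = tf.reshape(mask, [-1])

Passing weights instead of tf.ones([...]) to sequence_loss_by_example makes the padded steps contribute zero loss. If your TensorFlow version has it, tf.sequence_mask(x_len, max_time_steps) builds the same boolean mask in one call.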
Upvotes: 0