Reputation: 71
I'm referring to the example code in
http://www.wildml.com/2016/08/rnns-in-tensorflow-a-practical-guide-and-undocumented-features/
and
https://indico.io/blog/tensorflow-data-inputs-part1-placeholders-protobufs-queues/
for standard approaches to feeding data into language models (RNNs) using TensorFlow.
Before feeding the input, I make sure the sequences are padded to the maximum input length in the current batch and then randomly shuffled within that batch. So far so good.
However, I'm having difficulty specifying the initial state for tf.nn.dynamic_rnn, whose shape depends on the current batch_size.
If the batch size is fixed, there should not be any problem.
However, if I'm using tf.PaddingFIFOQueue.dequeue_up_to(batch_size), it may return fewer than batch_size elements when not enough elements are left in the queue (which can happen when dequeuing the last set of elements).
In this case, how do we specify the initial state with the exact batch size that was actually returned?
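To make this concrete, here is a rough sketch of the kind of pipeline I mean (the names and sizes are made up for illustration, not my actual code):

```python
import tensorflow as tf

vocab_size, embed_dim, num_units, batch_size = 10000, 128, 128, 32  # illustrative sizes

# A padding queue of (sequence, length) pairs; sequences are padded to the
# longest element in the dequeued batch.
queue = tf.PaddingFIFOQueue(capacity=1000,
                            dtypes=[tf.int32, tf.int32],
                            shapes=[[None], []])
seqs, lengths = queue.dequeue_up_to(batch_size)   # may return FEWER than batch_size

embedding = tf.get_variable("embedding", [vocab_size, embed_dim])
inputs = tf.nn.embedding_lookup(embedding, seqs)  # [actual_batch, time, embed_dim]

cell = tf.nn.rnn_cell.BasicLSTMCell(num_units)

# This is where I'm stuck: the initial state is built from the Python constant,
# so it no longer matches when the last batch comes back smaller.
initial_state = cell.zero_state(batch_size, tf.float32)
outputs, final_state = tf.nn.dynamic_rnn(cell, inputs,
                                         sequence_length=lengths,
                                         initial_state=initial_state)
```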
Upvotes: 1
Views: 135
Reputation: 6328
You can use the None
dimension in your Tensors to tell TensorFlow that the batch dimension can differ from one run to another.
You might want to read this FAQ on tensor shapes to get a better sense, and in particular this section (formatting is mine):
How do I build a graph that works with variable batch sizes?
It is often useful to build a graph that works with variable batch sizes, for example so that the same code can be used for (mini-)batch training, and single-instance inference. The resulting graph can be saved as a protocol buffer and imported into another program.
When building a variable-size graph, the most important thing to remember is not to encode the batch size as a Python constant, but instead to use a symbolic Tensor to represent it. The following tips may be useful:
- Use batch_size = tf.shape(input)[0] to extract the batch dimension from a Tensor called input, and store it in a Tensor called batch_size.
- Use tf.reduce_mean instead of tf.reduce_sum(...) / batch_size.
- If you use placeholders for feeding input, you can specify a variable batch dimension by creating the placeholder with tf.placeholder(..., shape=[None, ...]). The None element of the shape corresponds to a variable-sized dimension.
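Applied to your queue pipeline, here is a minimal sketch of the idea (the queue and variable names just mirror the sketch in the question for illustration, and I'm assuming a plain zero initial state is what you want):

```python
import tensorflow as tf

vocab_size, embed_dim, num_units = 10000, 128, 128  # illustrative sizes

queue = tf.PaddingFIFOQueue(capacity=1000,
                            dtypes=[tf.int32, tf.int32],
                            shapes=[[None], []])
seqs, lengths = queue.dequeue_up_to(32)           # may return fewer than 32

embedding = tf.get_variable("embedding", [vocab_size, embed_dim])
inputs = tf.nn.embedding_lookup(embedding, seqs)

cell = tf.nn.rnn_cell.BasicLSTMCell(num_units)

# Take the batch size from the dequeued tensor itself instead of hard-coding
# the Python constant that was passed to dequeue_up_to().
batch_size = tf.shape(seqs)[0]
initial_state = cell.zero_state(batch_size, tf.float32)  # accepts a scalar Tensor

outputs, final_state = tf.nn.dynamic_rnn(cell, inputs,
                                         sequence_length=lengths,
                                         initial_state=initial_state)

# With a feed_dict pipeline, the same idea is expressed with a None dimension:
ids = tf.placeholder(tf.int32, shape=[None, None])  # [batch, time], both variable
```

Since cell.zero_state() accepts a scalar Tensor for the batch size, the initial state now matches however many elements dequeue_up_to() actually returned, including a short final batch.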
Upvotes: 1