Reputation: 1880
I'm trying to sanity-check the size of the queue created by tf.train.string_input_producer (r1.3), but I just get a size of zero when I expect a size of three:
import tensorflow as tf

file_queue = tf.train.string_input_producer(['file1.csv', 'file2.csv', 'file3.csv'])
print('type(file_queue): {}'.format(type(file_queue)))

with tf.Session() as sess:
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    file_queue_size = sess.run(file_queue.size())
    coord.request_stop()
    coord.join(threads)

print('result of queue size operation: {}'.format(file_queue_size))
My hunch was that some sort of lazy initialization was going on, so I figured I would dequeue an item and check the size afterwards:
import tensorflow as tf

file_queue = tf.train.string_input_producer(['file1.csv', 'file2.csv', 'file3.csv'])
print('type(file_queue): {}'.format(type(file_queue)))

with tf.Session() as sess:
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    item = sess.run(file_queue.dequeue())
    file_queue_size = sess.run(file_queue.size())
    coord.request_stop()
    coord.join(threads)

print('result of queue size operation: {}'.format(file_queue_size))
While the size is no longer zero, it is not two either, and it changes every time the code is run. I feel like getting the size should be a simple thing, but maybe this is just not the way to interact with data_flow_ops.FIFOQueue. Any insight into what is going on here would be much appreciated.
Upvotes: 0
Views: 144
Reputation: 479
This is an artifact of the async nature in which queues are filled up by the queue runners. Try this code:
import tensorflow as tf
import time

file_queue = tf.train.string_input_producer(['file1.csv', 'file2.csv', 'file3.csv'])
print('type(file_queue): {}'.format(type(file_queue)))

with tf.Session() as sess:
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    time.sleep(1)
    file_queue_size = sess.run(file_queue.size())
    coord.request_stop()
    coord.join(threads)

print('result of queue size operation: {}'.format(file_queue_size))
The output should say result of queue size operation: 32. By pausing the main thread, the queue runners can run for long enough to fill up the queue.
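TensorFlow internals aside, the race can be sketched in plain Python, with a background thread standing in for the queue runner (the names, timings, and Full-handling here are illustrative, not part of the TF API):

```python
import queue
import threading
import time

# Plain-Python stand-in for a queue runner: a background thread cycles
# over the filename list forever and blocks once the queue is full.
# CAPACITY mirrors the string_input_producer default; everything else
# is an illustration of the mechanism, not the TF API.
CAPACITY = 32
FILENAMES = ['file1.csv', 'file2.csv', 'file3.csv']

q = queue.Queue(maxsize=CAPACITY)
stop = threading.Event()

def runner():
    while not stop.is_set():
        for name in FILENAMES:
            try:
                q.put(name, timeout=0.1)  # blocks while the queue is full
            except queue.Full:
                if stop.is_set():
                    return

threading.Thread(target=runner, daemon=True).start()

# Racy: depending on thread scheduling this can read anywhere from 0
# to 32, which is exactly the non-deterministic size the question saw.
print('size right after start: {}'.format(q.qsize()))

time.sleep(1)  # give the runner time to fill the queue
print('size after a pause: {}'.format(q.qsize()))  # reliably 32, the capacity
stop.set()
```

The first read is non-deterministic for the same reason the question's second snippet printed a different size on every run; only after the pause is the queue reliably at capacity.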
Why 32 and not 3? Let's look at the signature of string_input_producer:
string_input_producer(
    string_tensor,
    num_epochs=None,
    shuffle=True,
    seed=None,
    capacity=32,
    shared_name=None,
    name=None,
    cancel_op=None
)
Two main things to note here:

- num_epochs=None means the producer keeps iterating over the list of items forever. To iterate over the list just once, set num_epochs=1; you will also need to call sess.run(tf.local_variables_initializer()). With num_epochs=1, if you run the size op after some dequeues you will see the size decreasing.
- capacity=32 is the default capacity of the queue, which is why we see 32 instead of 3 in the output above.

Upvotes: 1