Reputation: 379
Before the new Dataset API of TensorFlow 1.4, I used the following code to create a queue of filenames shared between different workers:
# queue with the file names that can be shared amongst workers during training
filename_queue = tf.FIFOQueue(100, tf.string, shared_name=shared_name)
enqueue_op = filename_queue.enqueue_many([tf.train.limit_epochs(file_names, num_epochs)])
close_op = filename_queue.close(cancel_pending_enqueues=True)
# create queue runner and add it to queue runners
qr = tf.train.QueueRunner(filename_queue, [enqueue_op], close_op,
                          queue_closed_exception_types=(tf.errors.OutOfRangeError,
                                                        tf.errors.CancelledError))
tf.train.add_queue_runner(qr)
# read example from file
reader = tf.TFRecordReader()
_, example = reader.read(filename_queue)
# parse example
image, ground_truth, example_name = parse_example(example)
This code uses queues and queue runners, and it's quite ugly and confusing. But it supported the shared_name= option, which creates a queue shared between workers so that they don't process the same examples.
Since the release of TensorFlow 1.4, input pipelines have become much easier to use, so I want to update my program to use this new feature. However, I can't find anything in the new documentation about how to share a Dataset between workers.
Is this done automatically, or is it not a feature?
Upvotes: 1
Views: 919
Reputation: 315
You can use tf.data.Dataset.shard
(see documentation) for this purpose. The documentation illustrates how to "shard" the elements of a single file or (as in your example) how to "shard" the filenames, so that each worker sees a disjoint subset of the data.
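As a minimal sketch of the second approach (the filenames `file-*.tfrecord`, `num_workers`, and `worker_index` here are hypothetical placeholders): shard(num_workers, worker_index) keeps only every num_workers-th filename starting at position worker_index, so each worker reads a disjoint slice of the files without any shared queue.

```python
import tensorflow as tf

# Hypothetical filenames for illustration.
file_names = ["file-%d.tfrecord" % i for i in range(10)]
num_workers = 4   # total number of workers
worker_index = 1  # this worker's index in [0, num_workers)

dataset = tf.data.Dataset.from_tensor_slices(file_names)
# Keep only the filenames whose position modulo num_workers equals
# worker_index; different workers therefore see disjoint file sets.
dataset = dataset.shard(num_workers, worker_index)
```

Each worker runs the same pipeline with its own worker_index, then continues as usual (e.g. feeding the sharded filenames into a TFRecordDataset and a parse function).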
Upvotes: 1