Reputation: 85
I'm using Tensorflow for learning MNIST data. For batching I create a batch from single images like this:
BatchedInputs = list(tf.train.batch(
    Inputs,
    batch_size=BatchSize,
    num_threads=self._PreprocessThreads,
    capacity=self._MinimumSamplesInQueue + 3 * BatchSize))
When I create batches of size 1 (for testing) and look at those images in TensorBoard, I can see that the output is not the same in every run. The images are not fully shuffled, but sometimes a different image appears at a given position.
I would expect this operation to produce deterministic output, but that is not the case. Am I doing something wrong (e.g. starting the queue runners incorrectly)?
Upvotes: 2
Views: 589
Reputation: 126154
If you set num_threads > 1 when calling tf.train.batch(), the resulting program will be non-deterministic, because this creates num_threads uncoordinated prefetching threads that each evaluate Inputs and insert the next element into the queue. Since the prefetching threads are uncoordinated, they race to enqueue elements, and this race leads to non-determinism in the order of the queue's elements.
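The race can be illustrated with a plain-Python analogy (this is a hypothetical sketch using the standard library, not TensorFlow itself): several worker threads pull elements from a shared source and push them into an output queue, just as the prefetching threads do. With one worker the output order is fixed; with several workers the order depends on thread scheduling.

```python
import queue
import threading

def enqueue_order(items, num_threads):
    """Simulate prefetching threads racing to fill a batching queue.

    Each worker repeatedly takes the next element from a shared source
    and enqueues it into the output queue, analogous to tf.train.batch's
    num_threads prefetching threads.
    """
    source = queue.Queue()
    for item in items:
        source.put(item)
    output = queue.Queue()

    def worker():
        while True:
            try:
                element = source.get_nowait()
            except queue.Empty:
                return  # source exhausted, worker exits
            output.put(element)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    return [output.get() for _ in range(len(items))]

# One worker preserves the source order deterministically.
print(enqueue_order(list(range(8)), num_threads=1))

# Several workers produce *some* permutation of the input; which one
# depends on the thread schedule and can differ between runs.
print(enqueue_order(list(range(8)), num_threads=4))
```

The same elements always come out (nothing is lost), but only their order is guaranteed in the single-threaded case.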
Setting num_threads = 1 should make this part of your program deterministic, assuming that the other parts of your program are deterministic. However, this is a weak guarantee, and, in particular, any use of shuffling in the queue-based input routines (such as tf.train.shuffle_batch()) will make the program non-deterministic.
Upvotes: 6