jstaker7

Reputation: 1236

Understanding TensorFlow queues and CPU <-> GPU transfer

After reading this GitHub issue, I feel like I'm missing something in my understanding of queues:

https://github.com/tensorflow/tensorflow/issues/3009

I thought that when loading data into a queue, it would get pre-transferred to the GPU while the previous batch is being computed, so that there is virtually no bandwidth bottleneck, assuming computation takes longer than loading the next batch.

But the above link suggests that there is an expensive copy from the queue into the graph (numpy <-> TF) and that it would be faster to load the files inside the graph and do the preprocessing there instead. But that doesn't make sense to me. Why does it matter whether I load a 256x256 image from a file or from a raw numpy array? If anything, I would think that the numpy version is faster. What am I missing?
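For reference, here is roughly the kind of pipeline contrast I have in mind, sketched with TF 1.x queue ops (file names, image size, and batch size are just placeholders, not my actual setup):

```python
import tensorflow as tf

# Variant A: feed numpy arrays from Python into a queue
# (a Python thread would call sess.run(enqueue_op, feed_dict={image_in: arr})).
queue = tf.FIFOQueue(capacity=32, dtypes=[tf.float32], shapes=[[256, 256, 3]])
image_in = tf.placeholder(tf.float32, shape=[256, 256, 3])
enqueue_op = queue.enqueue(image_in)
batch_a = tf.train.batch([queue.dequeue()], batch_size=16)

# Variant B: read and decode the files inside the graph instead.
filename_queue = tf.train.string_input_producer(["img0.png", "img1.png"])
reader = tf.WholeFileReader()
_, contents = reader.read(filename_queue)
image = tf.image.decode_png(contents, channels=3)
image = tf.image.resize_images(tf.to_float(image), [256, 256])
batch_b = tf.train.batch([image], batch_size=16)
```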

Upvotes: 4

Views: 3000

Answers (2)

Simon

Reputation: 553

The documentation suggests that it is possible to pin a queue to a device:

N.B. Queue methods (such as q.enqueue(...)) must run on the same device as the queue. Incompatible device placement directives will be ignored when creating these operations.

But the above implies to me that any variables one is attempting to enqueue should already be on the GPU.

This comment suggests it may be possible to use tf.identity to perform the prefetch.
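A rough sketch of how I read that suggestion (queue capacity and shapes here are arbitrary, and I haven't verified that this actually overlaps the copy with compute):

```python
import tensorflow as tf

# The queue itself stays on the CPU (see the quoted note about device placement).
q = tf.FIFOQueue(capacity=8, dtypes=[tf.float32], shapes=[[256, 256, 3]])
image_in = tf.placeholder(tf.float32, shape=[256, 256, 3])
enqueue_op = q.enqueue(image_in)

with tf.device('/gpu:0'):
    # Wrapping the dequeued tensor in tf.identity places an op with a GPU
    # kernel on the device, so the host -> device copy happens at this point
    # in the graph rather than inside the ops that consume the image.
    image_on_gpu = tf.identity(q.dequeue())
```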

Upvotes: 2

Yaroslav Bulatov

Reputation: 57953

There's no implementation of a GPU queue, so it only loads the data into main memory and there's no asynchronous prefetching onto the GPU. You could build something like a GPU-based queue yourself using variables pinned to gpu:0.
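A minimal sketch of that idea (the buffer shape and the single-buffer setup are assumptions for illustration only):

```python
import tensorflow as tf

# CPU-side queue, fed elsewhere in the program (capacity/shape are illustrative).
cpu_queue = tf.FIFOQueue(capacity=8, dtypes=[tf.float32], shapes=[[16, 256, 256, 3]])

with tf.device('/gpu:0'):
    # Buffer variable that lives in GPU memory and acts as a one-slot "queue".
    gpu_buffer = tf.Variable(tf.zeros([16, 256, 256, 3]), trainable=False)
    # Running this op performs the host -> device copy of the next batch.
    prefetch_op = tf.assign(gpu_buffer, cpu_queue.dequeue())

# A training step would compute on gpu_buffer and run prefetch_op to pull in
# the following batch; overlapping the two cleanly generally needs two such
# buffers that you alternate between (double buffering).
```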

Upvotes: 4
