Reputation:
UPDATE:
I found the source code of GPUDevice, and it hard-codes max_streams to 1. Does anyone know the reason?
    GPUDevice(const SessionOptions& options, const string& name,
              Bytes memory_limit, const DeviceLocality& locality,
              TfGpuId tf_gpu_id, const string& physical_device_desc,
              Allocator* gpu_allocator, Allocator* cpu_allocator)
        : BaseGPUDevice(options, name, memory_limit, locality, tf_gpu_id,
                        physical_device_desc, gpu_allocator, cpu_allocator,
                        false /* sync every op */, 1 /* max_streams */) {
      if (options.config.has_gpu_options()) {
        force_gpu_compatible_ =
            options.config.gpu_options().force_gpu_compatible();
      }
    }
======================================
I am wondering whether TensorFlow (1.x) supports multi-threading or multiple streams on a single GPU. If not, I am curious about the underlying reason: did TF do this on purpose, does a library such as CUDA prevent TF from providing it, or is there some other reason?
Like some previous posts [1, 2], I tried to run multiple training ops in TF, i.e. sess.run([train_op1, train_op2], feed_dict={...}), and used the TF timeline to profile each iteration. However, the timeline always showed the two train ops running sequentially (although the timeline is not fully accurate [3], the wall time of each op also suggests sequential execution). I also looked at some of the TF source code: it looks like each op is computed in device->ComputeAsync() or device->Compute(), and the GPU is blocked while an op is being computed. If I am correct, one GPU can only run a single op at a time, which may lower GPU utilization.
1. Running multiple tensorflow sessions concurrently
2. Run parallel op with different inputs and same placeholder
3. https://github.com/tensorflow/tensorflow/issues/1824#issuecomment-244251867
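For reference, the wall-time comparison described above can be sketched roughly as follows. This is only a minimal illustration, not my actual training script: `run_op1`, `run_op2`, and `run_both` are hypothetical stand-ins (here just `time.sleep`) for `sess.run(train_op1, ...)`, `sess.run(train_op2, ...)`, and `sess.run([train_op1, train_op2], ...)`, and `overlap_fraction` is a helper name made up for this sketch. The idea is that if the combined run takes about as long as the two individual runs added together, the ops were effectively executed one after the other.

```python
import time


def wall_time(fn, repeats=3):
    """Return the best-of-`repeats` wall time of fn() in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def overlap_fraction(t_a, t_b, t_both):
    """Estimate how much of the shorter op was hidden behind the longer one.

    0.0 means fully sequential (t_both == t_a + t_b);
    1.0 means the shorter op overlapped completely (t_both == max(t_a, t_b)).
    The result is clamped to [0, 1] to absorb measurement noise.
    """
    hidden = t_a + t_b - t_both
    return max(0.0, min(1.0, hidden / min(t_a, t_b)))


# Hypothetical stand-ins for the real sess.run(...) calls.
def run_op1():
    time.sleep(0.02)


def run_op2():
    time.sleep(0.03)


def run_both():
    # Sequential on purpose, to mimic the behavior seen in the timeline.
    run_op1()
    run_op2()


t1 = wall_time(run_op1)
t2 = wall_time(run_op2)
t_both = wall_time(run_both)
print("op1: %.4fs, op2: %.4fs, both: %.4fs" % (t1, t2, t_both))
print("estimated overlap: %.2f" % overlap_fraction(t1, t2, t_both))
```

With real ops, an estimated overlap near 0 would be consistent with what I saw, and a value near 1 would indicate that the two ops actually ran concurrently on the GPU.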
Upvotes: 0
Views: 1668
Reputation: 1
I have had a similar experience. I have two GPUs; each GPU runs three threads, and each thread runs its own session. The running time of each session fluctuates a lot. If I run only one thread on each GPU, the session running time is quite stable.
From this behavior, we can conclude that threads in TensorFlow do not work well together on the same GPU, and that TensorFlow's mechanism has a problem here.
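To put a number on the fluctuation, something like the sketch below could be used. It is only an assumption-laden illustration: `fake_session_run` is a hypothetical stand-in for a real `sess.run(...)` call, and `measure_jitter` is a helper name invented here. It runs the same workload from several threads at once and reports the mean and standard deviation of the per-call wall time, so one thread per GPU can be compared against three.

```python
import statistics
import threading
import time


def time_runs(workload, runs, samples, lock):
    """Call workload() `runs` times and record each call's wall time."""
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        elapsed = time.perf_counter() - start
        with lock:
            samples.append(elapsed)


def measure_jitter(workload, num_threads, runs=10):
    """Return (mean, stdev) of per-call wall time with concurrent callers."""
    samples = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=time_runs, args=(workload, runs, samples, lock))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return statistics.mean(samples), statistics.stdev(samples)


def fake_session_run():
    # Hypothetical placeholder; replace with the real sess.run(...) call.
    time.sleep(0.005)


for n in (1, 3):
    mean, stdev = measure_jitter(fake_session_run, n)
    print("threads=%d mean=%.4fs stdev=%.4fs" % (n, mean, stdev))
```

A much larger standard deviation with three threads than with one would match the instability described above.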
Upvotes: 0