Reputation: 21
Hi friends! I have a question about processing with multiple GPUs. I'm using 4 GPUs and tried the simple A^n + B^n example in 3 ways, shown below.
1. Single GPU

    with tf.device('/gpu:0'):
        # ... matpow code ...

2. Multiple GPUs (2 GPUs)

    with tf.device('/gpu:0'):
        # ... matpow code ...
    with tf.device('/gpu:1'):
        # ... matpow code ...

3. No specific GPU designated (I assumed all 4 GPUs would be used)

    # ... matpow code, with no device pinning ...
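Roughly, the multi-GPU variant looks like the condensed sketch below (TF 1.x style; note that matpow here is a hand-rolled helper built on tf.matmul, since there is no tf.matpow op in the API):

    import numpy as np
    import tensorflow as tf

    n = 10
    A = np.random.rand(10000, 10000).astype('float32')
    B = np.random.rand(10000, 10000).astype('float32')

    def matpow(M, n):
        # Compute M^n by repeated tf.matmul (n >= 1).
        result = M
        for _ in range(n - 1):
            result = tf.matmul(result, M)
        return result

    # Variant 2: pin one power computation to each of two GPUs,
    # then sum the partial results on the CPU.
    c = []
    with tf.device('/gpu:0'):
        a = tf.placeholder(tf.float32, [10000, 10000])
        c.append(matpow(a, n))
    with tf.device('/gpu:1'):
        b = tf.placeholder(tf.float32, [10000, 10000])
        c.append(matpow(b, n))
    with tf.device('/cpu:0'):
        total = tf.add_n(c)  # A^n + B^n

    with tf.Session(config=tf.ConfigProto(allow_soft_placement=True)) as sess:
        sess.run(total, feed_dict={a: A, b: B})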
When I tried this, the results were confusing:

1. Single GPU: 6.x seconds
2. Multiple GPUs (2 GPUs): 2.x seconds
3. No specific GPU designated (maybe 4 GPUs): 4.x seconds
I cannot understand why #2 is faster than #3. Can anyone help me?
Thanks.
Upvotes: 2
Views: 2341
Reputation: 44
In the third code sample (where no GPU was designated), TensorFlow didn't use all of your GPUs. By default, if TensorFlow can find a GPU ("/gpu:0"), it assigns as many calculations as possible to that one GPU. You would need to tell it explicitly that you want it to use all 4, as you did in the second code sample.
From the TensorFlow documentation:
If you have more than one GPU in your system, the GPU with the lowest ID will be selected by default. If you would like to run on a different GPU, you will need to specify the preference explicitly:
    with tf.device('/gpu:2'):
        # tf code here
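For example, here is an untested sketch (again TF 1.x style) of how you could spread the same kind of matpow work across all four GPUs explicitly and combine the partial results on the CPU:

    import numpy as np
    import tensorflow as tf

    def matpow(M, n):
        # M^n by repeated tf.matmul (n >= 1)
        result = M
        for _ in range(n - 1):
            result = tf.matmul(result, M)
        return result

    A = np.random.rand(4000, 4000).astype('float32')

    partials = []
    for i in range(4):
        with tf.device('/gpu:%d' % i):  # pin one chunk to each GPU
            partials.append(matpow(tf.constant(A), 5))
    with tf.device('/cpu:0'):
        total = tf.add_n(partials)

    # allow_soft_placement lets TF fall back gracefully if a device is missing
    with tf.Session(config=tf.ConfigProto(allow_soft_placement=True)) as sess:
        sess.run(total)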
Upvotes: 1
Reputation: 639
While the TensorFlow scheduler works well for single GPUs, it is not yet as good at optimizing the placement of computations across multiple GPUs (although this is being worked on at present). Without further details, it's hard to know exactly what's going on. To get a better picture, you can log where the scheduler is actually placing each computation. You do this by setting the log_device_placement flag when creating the tf.Session:
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
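For a quick end-to-end check, here is a minimal self-contained example in the spirit of the docs. Each op's device assignment (e.g. which /gpu:N ran MatMul) is printed as the graph executes, so you can verify which GPUs your unpinned variant actually used:

    import tensorflow as tf

    a = tf.constant([[1.0, 2.0], [3.0, 4.0]], name='a')
    b = tf.constant([[1.0, 1.0], [0.0, 1.0]], name='b')
    c = tf.matmul(a, b)

    # The placement of a, b, and MatMul is logged when the graph runs.
    sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
    print(sess.run(c))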
Upvotes: 2