Reputation: 11
Currently I'm implementing a large custom model, referencing the multi-GPU CIFAR-10 example that ships with TensorFlow. However, the code I ended up writing based on it was not clean and was error-prone: for example, I had to find every trainable variable and wrap its creation in `with tf.device('/cpu:0')`, as in the sketch below.
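Roughly, every variable ended up pinned by hand like this (simplified sketch; the variable names are just illustrative):

```python
import tensorflow as tf

# every trainable variable has to be created under an explicit
# CPU placement so that all GPU towers can share it
with tf.device('/cpu:0'):
    weights = tf.get_variable('weights', [1024, 1024])
with tf.device('/cpu:0'):
    biases = tf.get_variable('biases', [1024])
```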
Are there more efficient/cleaner ways of adapting a model for multi-GPU execution?
Many thanks for any support.
Upvotes: 1
Views: 486
Reputation: 57903
Here's an example from Rafal: you make a loop over towers, with the body constructing the i-th tower under `with tf.device(assign_to_gpu(i))`. The `assign_to_gpu` function treats variables differently and assigns them onto a "ps device".

Note: we found that when GPUs are p2p connected, training was faster when variables were kept on `gpu:0` rather than `cpu:0`.
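A minimal sketch of that pattern (TF 1.x graph mode; `build_tower`, `num_gpus`, and the toy model body are illustrative placeholders, and this `assign_to_gpu` is reconstructed from the description above, not copied from Rafal's code):

```python
import tensorflow as tf

def assign_to_gpu(gpu=0, ps_device="/gpu:0"):
    """Returns a device function: variable ops land on ps_device,
    every other op lands on the given GPU."""
    def _assign(op):
        # tf.device() can take a function that is called for each op
        if op.node_def.op in ("Variable", "VariableV2", "VarHandleOp"):
            return ps_device
        return "/gpu:%d" % gpu
    return _assign

def build_tower(x):
    # toy model body; replace with your real model
    w = tf.get_variable("w", [10, 1])
    return tf.reduce_mean(tf.matmul(x, w))

num_gpus = 4  # assumption: set to your GPU count
x = tf.placeholder(tf.float32, [None, 10])
tower_losses = []
for i in range(num_gpus):
    # reuse=(i > 0) makes every tower share the same variables
    with tf.device(assign_to_gpu(i, ps_device="/gpu:0")), \
         tf.variable_scope("model", reuse=(i > 0)):
        tower_losses.append(build_tower(x))
loss = tf.add_n(tower_losses) / num_gpus
```

Because the device function sees every op, only variable creation is redirected to the ps device; the forward (and backward) ops of tower i stay on `gpu:i`. If your GPUs are not p2p connected, pass `ps_device="/cpu:0"` instead, per the note above.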
Upvotes: 2