Reputation: 10888
Let's pretend that I launch the following commands in parallel to train many TensorFlow models at once on the same machine:
python3 launch_training.py --gpu 0
python3 launch_training.py --gpu 1
python3 launch_training.py --gpu 2
python3 launch_training.py --gpu 3
python3 launch_training.py --gpu 4
python3 launch_training.py --gpu 5
python3 launch_training.py --gpu 6
python3 launch_training.py --gpu 7
Let's pretend that inside launch_training.py, a TensorFlow graph and session are created, and the graph is built under the context with tf.device('/gpu:0'):, where the 0 is replaced by the --gpu index argument.
Will this work? If not, which steps would I have to take to make this work? I'd like to know this before renting GPUs.
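A minimal sketch of the setup described above, assuming the TF 1.x graph/session API — the --gpu flag comes from the commands in the question, while the helper names (device_string, parse_args) are hypothetical:

```python
import argparse

def device_string(gpu_index):
    # TensorFlow GPU device names take the form '/gpu:N'
    # (newer versions also accept '/device:GPU:N').
    return '/gpu:%d' % gpu_index

def parse_args(argv=None):
    # Mirrors the --gpu flag used in the commands above.
    parser = argparse.ArgumentParser()
    parser.add_argument('--gpu', type=int, required=True)
    return parser.parse_args(argv)

# Inside launch_training.py the graph would then be built under:
#   with tf.device(device_string(args.gpu)):
#       ... create the graph and tf.Session(), run training ...
```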
Upvotes: 2
Views: 855
Reputation: 1576
You have to specify a GPU device with with tf.device('/gpu:N')
, where N
is the device index. Read https://www.tensorflow.org/programmers_guide/using_gpu and https://github.com/carla-simulator/carla/issues/116 first.
I think you've confused running the same script multiple times on different GPUs with running one script that uses multiple GPUs. For the former case, read the "Using a single GPU on a multi-GPU system" section of the TensorFlow guide; for the latter, read "Using multiple GPUs".
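To make that distinction concrete, here is a hedged sketch (the helper name tower_devices is hypothetical): in the one-script-many-GPUs case, each model replica ("tower") is pinned to its own device string, whereas the question's setup pins each whole process to a single device.

```python
def tower_devices(num_gpus):
    # One TensorFlow device string per replica, for a single script
    # that builds one model tower per GPU.
    return ['/gpu:%d' % i for i in range(num_gpus)]

# The question's setup instead uses exactly one of these per process,
# e.g. with tf.device(tower_devices(8)[args.gpu]): ...
```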
Upvotes: 1