Reputation: 11002
I have a convolutional neural network with two different output streams:
                 input
                   |
                 (...)        <-- several convolutional layers
                   |
            _______________
           |               |
    (several layers)  (several layers)
    fully-connected   fully-connected
           |               |
    output stream 1   output stream 2
I would like to compute stream 1 on /gpu:0 and stream 2 on /gpu:1. Unfortunately, I was not able to set it up properly.
This attempt:
...placeholders...
...conv layers...

with tf.device("/gpu:0"):
    ...stream 1 layers...
    nn_out_1 = tf.matmul(...)

with tf.device("/gpu:1"):
    ...stream 2 layers...
    nn_out_2 = tf.matmul(...)
This runs dead slow (slower than training on a single GPU alone) and sometimes produces NaN values in the output. I thought this might be because the with statements were not synchronized properly, so I added control_dependencies and placed the conv layers on /gpu:0 explicitly:
...placeholders... # x -> input, y -> labels

with tf.device("/gpu:0"):
    with tf.control_dependencies([x, y]):
        ...conv layers...
        h_conv_flat = tf.reshape(h_conv_last, ...)

with tf.device("/gpu:0"):
    with tf.control_dependencies([h_conv_flat]):
        ...stream 1 layers...
        nn_out_1 = tf.matmul(...)

with tf.device("/gpu:1"):
    with tf.control_dependencies([h_conv_flat]):
        ...stream 2 layers...
        nn_out_2 = tf.matmul(...)
...but with this approach the network doesn't even run. No matter what I tried, it complained about the input not being fed:
tensorflow.python.framework.errors.InvalidArgumentError:
    You must feed a value for placeholder tensor 'x' with dtype float
    [[Node: x = Placeholder[dtype=DT_FLOAT, shape=[],
      _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
Without the with statements, the network trains on /gpu:0 only and runs fine: it learns reasonable things and raises no errors.
What am I doing wrong? Is TensorFlow unable to split different streams of layers within one network across different GPUs? Do I always have to split the complete network into separate towers?
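To make this easier to reproduce, here is a stripped-down, self-contained version of what I am trying to do (a single conv layer and made-up shapes stand in for the real network):

import numpy as np
import tensorflow as tf

# Made-up input shape; the real network has several conv layers.
x = tf.placeholder(tf.float32, shape=[None, 28, 28, 1], name='x')

# Shared convolutional trunk (device left to TensorFlow's default placement).
W_conv = tf.Variable(tf.truncated_normal([5, 5, 1, 8], stddev=0.1))
h_conv = tf.nn.relu(tf.nn.conv2d(x, W_conv, strides=[1, 1, 1, 1], padding='SAME'))
h_conv_flat = tf.reshape(h_conv, [-1, 28 * 28 * 8])

# Stream 1 on the first GPU.
with tf.device('/gpu:0'):
    W1 = tf.Variable(tf.truncated_normal([28 * 28 * 8, 10], stddev=0.1))
    nn_out_1 = tf.matmul(h_conv_flat, W1)

# Stream 2 on the second GPU.
with tf.device('/gpu:1'):
    W2 = tf.Variable(tf.truncated_normal([28 * 28 * 8, 10], stddev=0.1))
    nn_out_2 = tf.matmul(h_conv_flat, W2)

sess = tf.Session()
sess.run(tf.initialize_all_variables())
out1, out2 = sess.run([nn_out_1, nn_out_2],
                      feed_dict={x: np.zeros((4, 28, 28, 1), dtype=np.float32)})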
Upvotes: 4
Views: 1572
Reputation: 53
There is an example of how to use multiple GPUs in one network at https://github.com/tensorflow/tensorflow/blob/master/tensorflow/models/image/cifar10/cifar10_multi_gpu_train.py; you can probably copy that code. You can also do something like this:
import tensorflow as tf

# Creates a graph.
c = []
for d in ['/gpu:2', '/gpu:3']:
    with tf.device(d):
        a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3])
        b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2])
        c.append(tf.matmul(a, b))
with tf.device('/cpu:0'):
    total = tf.add_n(c)  # renamed from `sum` to avoid shadowing the builtin
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(total))
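With log_device_placement set to True, the session prints the device each op was assigned to, so you can verify that the two matmuls really ended up on /gpu:2 and /gpu:3.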
See also: https://www.tensorflow.org/versions/r0.7/how_tos/using_gpu/index.html#using-multiple-gpus
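Mapped onto your two streams, the same pattern might look roughly like the sketch below (the dense helper and all shapes are made up for illustration; allow_soft_placement=True makes TensorFlow fall back to the CPU for ops that have no GPU kernel instead of failing):

import tensorflow as tf

def dense(h, n_out):
    # Stand-in for one of your fully-connected streams.
    n_in = h.get_shape()[1].value
    W = tf.Variable(tf.truncated_normal([n_in, n_out], stddev=0.1))
    b = tf.Variable(tf.zeros([n_out]))
    return tf.matmul(h, W) + b

# h_conv_flat stands for the flattened output of the shared conv layers.
h_conv_flat = tf.placeholder(tf.float32, shape=[None, 1024])

# One stream per GPU, both reading the same shared tensor.
outputs = []
for device in ['/gpu:0', '/gpu:1']:
    with tf.device(device):
        outputs.append(dense(h_conv_flat, 10))
nn_out_1, nn_out_2 = outputs

sess = tf.Session(config=tf.ConfigProto(allow_soft_placement=True,
                                        log_device_placement=True))
sess.run(tf.initialize_all_variables())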
Best Regards
Upvotes: 2