Reputation: 2356
I have two questions:
(1) How does TensorFlow allocate GPU memory when using only one GPU? I have an implementation of a 2D convolution like this (the whole graph is placed on the GPU):
def _conv(self, name, x, filter_size, in_filters, out_filters, strides):
    with tf.variable_scope(name):
        n = filter_size * filter_size * out_filters
        kernel = tf.get_variable(
            '', [filter_size, filter_size, in_filters, out_filters], tf.float32,
            initializer=tf.random_normal_initializer(stddev=np.sqrt(2.0 / n)),
        )
        return tf.nn.conv2d(x, kernel, strides, padding='SAME')
        # another option
        # x = tf.nn.conv2d(x, kernel, strides, padding='SAME')
        # return x
The other option in the comments performs the same operation but assigns the result to the Python name x before returning it. In this case, will TF allocate more GPU memory?
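(For question (1), a quick way to check is to build each variant in a fresh graph and compare the node counts. The snippet below is only a minimal sketch: the standalone count_nodes helper, the fixed 3x3x3x16 filter shape and the placeholder input are illustrative assumptions, not part of the code above.)

import numpy as np
import tensorflow as tf  # TF 1.x style API, matching the code above

def count_nodes(return_directly):
    """Builds one conv layer in a fresh graph and returns its node count."""
    g = tf.Graph()
    with g.as_default():
        images = tf.placeholder(tf.float32, [None, 32, 32, 3])
        n = 3 * 3 * 16
        kernel = tf.get_variable(
            'kernel', [3, 3, 3, 16], tf.float32,
            initializer=tf.random_normal_initializer(stddev=np.sqrt(2.0 / n)))
        if return_directly:
            tf.nn.conv2d(images, kernel, [1, 1, 1, 1], padding='SAME')
        else:
            # assigning to 'x' adds a Python name, not an extra graph node
            x = tf.nn.conv2d(images, kernel, [1, 1, 1, 1], padding='SAME')
        return len(g.as_graph_def().node)

# identical node counts => the second variant needs no extra GPU memory
print(count_nodes(True) == count_nodes(False))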
(2) When using multiple GPUs, I'd like to use a list to gather the results from the GPUs. The implementation is below:
def _conv(self, name, input, filter_size, in_filters, out_filters, strides, trainable=True):
    assert type(input) is list
    assert len(input) == FLAGS.gpu_num
    n = filter_size * filter_size * out_filters
    output = []
    for i in range(len(input)):
        with tf.device('/gpu:%d' % i):
            with tf.variable_scope(name, reuse=i > 0):
                kernel = tf.get_variable(
                    '', [filter_size, filter_size, in_filters, out_filters], tf.float32,
                    initializer=tf.random_normal_initializer(stddev=np.sqrt(2.0 / n))
                )
                output.append(tf.nn.conv2d(input[i], kernel, strides, padding='SAME'))
    return output
Will TF allocate more memory because of the use of the list? Is output (the list) attached to some GPU device? I am asking because when I train the CNNs on two GPUs with this implementation, the program uses much more GPU memory than when using one GPU. I think there is something I missed or misunderstood.
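(A quick check for the device question, as a hedged sketch: model and inputs are hypothetical names standing in for however _conv is actually called. The Python list is just a host-side container; each tensor inside it carries its own device string, which can be printed directly.)

# hypothetical usage; 'model' and 'inputs' stand in for the real objects
outputs = model._conv('conv1', inputs, 3, 3, 16, [1, 1, 1, 1])
for i, out in enumerate(outputs):
    # each tensor reports the device it was placed on; the list itself is
    # only a Python object on the host and occupies no GPU memory
    print(i, out.name, out.device)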
Upvotes: 1
Views: 1162
Reputation: 2356
Use this code to check each tensor and the device it is attached to:
for n in tf.get_default_graph().as_graph_def().node:
    print(n.name, n.device)
So the answers to these two questions are:
(1) No.
(2) If I'd like to gather intermediate data across GPUs and that data is then used to compute gradients, there will be a problem, because computing the gradients consumes memory too. Whenever data is accessed across GPUs, additional memory is allocated.
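(One common way to avoid that extra allocation, sketched roughly below under the assumption of a tower-style setup: loss_fn, inputs_per_gpu and optimizer are placeholders for whatever the real model uses, not code from the question. Each tower computes both its forward pass and its gradients on its own GPU, and only the much smaller gradients are moved to the CPU for averaging.)

import tensorflow as tf  # TF 1.x style API

def build_multi_gpu_train_op(loss_fn, inputs_per_gpu, optimizer):
    """loss_fn(x) builds one tower and returns a scalar loss for input x."""
    tower_grads = []
    for i, x in enumerate(inputs_per_gpu):
        with tf.device('/gpu:%d' % i):
            with tf.variable_scope('model', reuse=i > 0):
                loss = loss_fn(x)
                # gradients are created on the same GPU as the forward pass,
                # so intermediate activations never have to cross devices
                tower_grads.append(optimizer.compute_gradients(loss))
    with tf.device('/cpu:0'):
        averaged = []
        # zip(*tower_grads) groups the (grad, var) pairs of each shared
        # variable across the towers
        for grads_and_vars in zip(*tower_grads):
            grads = [g for g, _ in grads_and_vars if g is not None]
            if not grads:
                continue
            var = grads_and_vars[0][1]
            averaged.append((tf.reduce_mean(tf.stack(grads), axis=0), var))
        return optimizer.apply_gradients(averaged)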
Upvotes: 0