Reputation: 163
I'm trying to understand what this argument does. The TF API docs for AdamOptimizer's compute_gradients
method say the following:
colocate_gradients_with_ops
: If True, try colocating gradients with the corresponding op.
But it is not clear to me. What does "colocating gradients" mean in this context, and what is the op in question?
Upvotes: 4
Views: 3813
Reputation: 1705
colocate_gradients_with_ops=False (default): gradients are computed on a single GPU (most likely /gpu:0).
colocate_gradients_with_ops=True: gradients are computed on multiple GPUs in parallel.
In more detail:
colocate_gradients_with_ops=False (default)
Only the predictions (forward pass) are computed in parallel; the computed predictions are then moved to the parameter server (cpu:0 or gpu:0). With those predictions, the parameter server (a single device) computes the gradients for the whole batch (gpu_count * batch_size).
colocate_gradients_with_ops=True
Places the computation of each gradient on the same device as its corresponding (forward-pass) op. That is, the gradients are computed alongside the predictions on each individual GPU, and only the per-GPU gradients are moved to the parameter server, which averages them to obtain the overall gradients. A sketch of how this fits into a multi-tower training loop is shown below.
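For concreteness, here is a minimal multi-tower sketch (TF 1.x graph mode, the API this question is about) showing where the flag goes. The toy model, GPU count, and batch sizes are made-up placeholders for illustration, not anything prescribed by the API:

```python
import tensorflow as tf

NUM_GPUS = 2        # assumed GPU count, for illustration only
BATCH_PER_GPU = 32  # assumed per-GPU batch size

def tower_loss(x, y):
    # Hypothetical toy model: one dense layer plus softmax cross-entropy.
    logits = tf.layers.dense(x, units=10, name='logits')
    return tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits))

optimizer = tf.train.AdamOptimizer(1e-3)
tower_grads = []

with tf.variable_scope(tf.get_variable_scope()):
    for i in range(NUM_GPUS):
        with tf.device('/gpu:%d' % i):
            x = tf.placeholder(tf.float32, [BATCH_PER_GPU, 784])
            y = tf.placeholder(tf.int32, [BATCH_PER_GPU])
            loss = tower_loss(x, y)
            # With colocate_gradients_with_ops=True, each gradient op is
            # placed on the same GPU as the forward-pass op it differentiates,
            # so the backward pass also runs in parallel across the towers.
            grads = optimizer.compute_gradients(
                loss, colocate_gradients_with_ops=True)
            tower_grads.append(grads)
            # Reuse the same variables for the next tower.
            tf.get_variable_scope().reuse_variables()

# The parameter server (here cpu:0) only averages the per-tower gradients
# and applies the update; it no longer computes the gradients itself.
with tf.device('/cpu:0'):
    averaged = []
    for grads_and_vars in zip(*tower_grads):
        grad = tf.reduce_mean(tf.stack([g for g, _ in grads_and_vars]), axis=0)
        averaged.append((grad, grads_and_vars[0][1]))
    train_op = optimizer.apply_gradients(averaged)
```

In short: with the flag set to True, the expensive backward pass is parallelized across the towers just like the forward pass, leaving the parameter server with only the cheap averaging and update steps.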
Upvotes: 11