musically_ut

Reputation: 34288

Automatic broadcasting in Tensorflow

What is the best way of dynamically broadcasting a 1D vector such that it can perform element-wise multiplication with rows of the supplied tensor?

At the moment, I have the following verbose "solution", which uses tf.tile to expand coef (the 1D vector) to the shape of the second argument on a case-by-case basis:

import tensorflow as tf
tf.reset_default_graph()
iterSession = tf.InteractiveSession()

coef = tf.constant([1., 2, 3])  # shape = (3,)

LL_grads = {
    'a': tf.constant([[1.], [2], [3]]),  # shape = (3, 1)
    'b': tf.constant([[1., 2], [3, 4], [5, 6]]),  # shape = (3, 2)
    'c': tf.constant([[[1.], [2]], [[3], [4]], [[5], [6]]])  # shape = (3, 2, 1)
}

avg_grad_stacked = {}
for x in ['a', 'b', 'c']:
    LL_grad = LL_grads[x]

    dim = len(LL_grad.get_shape())

    if dim == 1:
        avg_grad_stacked[x] = LL_grad * coef
    elif dim == 2:
        # Manually broadcast to (3, 2)
        avg_grad_stacked[x] = LL_grad * tf.tile(tf.reshape(coef, (-1, 1)), 
                                                [1, tf.shape(LL_grad)[1]])
    elif dim == 3:
        # Manually broadcast to (3, 2, 1)
        avg_grad_stacked[x] = LL_grad * tf.tile(tf.reshape(coef, (-1, 1, 1)), 
                                                [1, tf.shape(LL_grad)[1], tf.shape(LL_grad)[2]])

Ideally, I would like to have something as simple and Pythonic as:

avg_grad_stacked_2 = {x:coef * y for x, y in LL_grads.items()}

However, this fails with the error:

ValueError: Dimensions must be equal, but are 3 and 2 for 'mul_4' (op: 'Mul') with input shapes: [3], [3,2].

So is there an automatic way of broadcasting a vector?

Upvotes: 3

Views: 1343

Answers (1)

benjaminplanche

Reputation: 15119

"Pythonic" answer:

import tensorflow as tf
tf.reset_default_graph()
iterSession = tf.InteractiveSession()

coef = tf.constant([1., 2, 3])  # shape = (3,)

LL_grads = {
    'a': tf.constant([[1.], [2], [3]]),  # shape = (3, 1)
    'b': tf.constant([[1., 2], [3, 4], [5, 6]]),  # shape = (3, 2)
    'c': tf.constant([[[1.], [2]], [[3], [4]], [[5], [6]]])  # shape = (3, 2, 1)
}

avg_grad_stacked = {x: tf.transpose(tf.transpose(LL_grad) * coef) for x, LL_grad in LL_grads.items()}

Explanation:

You only need to tile manually in your case because you are multiplying along the first dimension. TensorFlow takes care of the broadcasting itself when the tensors align along their last dimensions. A solution is thus simply to transpose your tensors before the multiplication, then transpose the result back.
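As a minimal sketch of the same rule (using NumPy, which shares TensorFlow's broadcasting semantics), trailing axes broadcast automatically, so either the transpose trick or an explicit reshape of coef to a column avoids any tiling:

```python
import numpy as np

coef = np.array([1., 2., 3.])                 # shape (3,)
b = np.array([[1., 2.], [3., 4.], [5., 6.]])  # shape (3, 2)

# Transpose so the shared axis becomes the last one, multiply, transpose back.
res_transpose = (b.T * coef).T

# Equivalent: reshape coef to (3, 1) so the trailing axes line up directly.
res_reshape = b * coef.reshape(-1, 1)
```

Both variants scale row i of b by coef[i]; in TensorFlow the reshape variant would need coef reshaped to (3, 1, ..., 1) to match the rank of each tensor, which is why the transpose form generalizes more cleanly across the dictionary.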


Previous answer with tf.einsum():

It may not directly answer your question, as it isn't much more Pythonic and doesn't avoid per-rank handling, but tf.einsum() is a powerful tool for multiplying tensors of different dimensions (among other things).

In your case, it could be used somehow like this:

import tensorflow as tf
import string
tf.reset_default_graph()
iterSession = tf.InteractiveSession()

coef = tf.constant([1., 2, 3])  # shape = (3,)

LL_grads = {
    'a': tf.constant([[1.], [2], [3]]),  # shape = (3, 1)
    'b': tf.constant([[1., 2], [3, 4], [5, 6]]),  # shape = (3, 2)
    'c': tf.constant([[[1.], [2]], [[3], [4]], [[5], [6]]])  # shape = (3, 2, 1)
}

avg_grad_stacked = {}
for x, LL_grad in LL_grads.items():
    dim = len(LL_grad.get_shape())

    coef_axis = string.ascii_lowercase[0]                 # "a"
    LL_grads_axes = "".join(
        [string.ascii_lowercase[i] for i in range(dim)])  # e.g. "abc" for dim==3

    ein_equation = "{0},{1}->{0}".format(
        LL_grads_axes, coef_axis)                         # e.g. "abc,a->abc"
    avg_grad_stacked[x] = tf.einsum(ein_equation, LL_grad, coef)
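To make the generated equations concrete: for the dim == 2 case the loop builds "ab,a->ab", and for dim == 3 it builds "abc,a->abc". A quick sketch with NumPy's einsum (which uses the same equation syntax as tf.einsum) shows the dim == 2 contraction scaling each row by the matching coefficient:

```python
import numpy as np

coef = np.array([1., 2., 3.])                    # shape (3,)
grad = np.array([[1., 2.], [3., 4.], [5., 6.]])  # shape (3, 2), dim == 2

# "ab,a->ab": keep both axes of grad, contract nothing, scale row i by coef[i].
result = np.einsum("ab,a->ab", grad, coef)
```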

Upvotes: 2
