fac120

Reputation: 25

Trouble understanding broadcasting behavior for tensors

I am trying to do element-wise multiplication of two tensors of shapes (1, 5, 64) and (1, 5). As far as I know, broadcasting should allow this to work despite the shape mismatch. So I use this code:

x = tf.range(0,64*5)
x = tf.reshape(x, [1,5, 64])

y = tf.range(0,5)
y = tf.reshape(y, [1, 5])

prodct = x*y

This causes this error:

InvalidArgumentError: Incompatible shapes: [1,5,64] vs. [1,5] [Op:Mul]

However, if I reshape the first tensor to shape (1, 64, 5), then it works:

x = tf.range(0,64*5)
x = tf.reshape(x, [1,64, 5])

y = tf.range(0,5)
y = tf.reshape(y, [1, 5])

prodct = x*y

I do not understand why the first code does not work.

Upvotes: 2

Views: 182

Answers (1)

Innat

Reputation: 17219

When operating on two arrays, NumPy's general broadcasting rules compare their shapes element-wise. The comparison starts with the trailing (i.e. rightmost) dimensions and works its way left. Two dimensions are compatible when

  • they are equal, or
  • one of them is 1

If these conditions are not met, a ValueError: operands could not be broadcast together exception is thrown, indicating that the arrays have incompatible shapes. The size of the resulting array is the size that is not 1 along each axis of the inputs.

TensorFlow also follows the same spirit. Check the documentation for more examples and details. In your case, the rightmost dimensions (64 and 5) are neither equal nor 1, so they do not satisfy the rules and an error is raised.

1, 5, 64
   1, 5

But this would work as it obeys the rules.

1, 64, 5
   1,  5
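
The right-to-left comparison described above can be sketched in plain Python. This is only an illustration of the rule, not how NumPy or TensorFlow implement it; the helper name broadcast_shape is made up for this example.

```python
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError if incompatible.

    Hypothetical helper written for illustration only.
    """
    result = []
    # Walk both shapes from the trailing dimension to the left.
    # A missing leading dimension is treated as size 1.
    pairs = zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1)
    for dim_a, dim_b in pairs:
        if dim_a == dim_b or dim_a == 1 or dim_b == 1:
            # The result takes the size that is not 1 (or the shared size).
            result.append(dim_b if dim_a == 1 else dim_a)
        else:
            raise ValueError(f"incompatible shapes: {shape_a} vs. {shape_b}")
    return tuple(reversed(result))


# (1, 64, 5) against (1, 5): 5 == 5, then 64 vs 1, then 1 vs missing.
print(broadcast_shape((1, 64, 5), (1, 5)))

# (1, 5, 64) against (1, 5): 64 vs 5 fails at the very first comparison.
try:
    broadcast_shape((1, 5, 64), (1, 5))
except ValueError as err:
    print(err)
```

The first call prints the result shape and the second prints the error message, mirroring the two aligned shape pairs shown above.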

Code

In NumPy, and in TensorFlow for reference:

import numpy as np

a = np.arange(64 * 5).reshape(1, 64, 5)
b = np.arange(5).reshape(1, 5)
(a * b).shape  # (1, 64, 5)

import tensorflow as tf

x = tf.reshape(tf.range(0, 64 * 5), [1, 64, 5])
y = tf.reshape(tf.range(0, 5), [1, 5])
(x * y).shape  # TensorShape([1, 64, 5])
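
If the goal is to keep the original (1, 5, 64) layout, one option is to append a size-1 axis to the smaller array so that its shape becomes (1, 5, 1), which then lines up from the right. A minimal sketch in NumPy, assuming each of the 5 values should scale one row of 64 entries (the name b_expanded is chosen just for this example):

```python
import numpy as np

a = np.arange(64 * 5).reshape(1, 5, 64)
b = np.arange(5).reshape(1, 5)

# Append a trailing size-1 axis: (1, 5) -> (1, 5, 1).
# Aligned from the right: 64 vs 1, 5 vs 5, 1 vs 1, all compatible.
b_expanded = b[:, :, np.newaxis]

product = a * b_expanded
print(product.shape)  # (1, 5, 64)
```

In TensorFlow the same reshaping can be done with tf.expand_dims(y, axis=-1) before multiplying.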

Upvotes: 1
