dereks

Reputation: 564

Why doesn't `tf.matmul` work with a transposed tensor?

Using transpose_b=True works, but passing tf.transpose(inp) does not.

This screenshot was made in Colab with tensorflow-gpu==2.0.0-rc1:

[screenshot of the error in Colab]
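
A minimal sketch that reproduces this (assuming an input of shape (2, 4, 1), as in the answers below):

import tensorflow as tf

inp = tf.reshape(tf.linspace(-1.0, 1.0, 8), (2, 4, 1))

# works: transpose_b swaps only the last two axes of the second operand,
# so this multiplies (2, 4, 1) by (2, 1, 4) and yields shape (2, 4, 4)
tf.matmul(inp, inp, transpose_b=True)

# fails: tf.transpose with no perm reverses ALL axes, giving shape (1, 4, 2),
# and (2, 4, 1) x (1, 4, 2) is not a valid (batch) matrix multiplication
tf.matmul(inp, tf.transpose(inp))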

Upvotes: 2

Views: 819

Answers (3)

thomi

Reputation: 1697

What TensorFlow is telling you is that the dimensions do not match up when multiplying the two tensors. Think of it in basic linear algebra terms: you can only multiply matrices where the last dimension of the first matrix equals the first dimension of the second. E.g. you can multiply a 2x4 matrix with a 4x2 matrix (which is what the transpose does for you). From the docs:

If perm is not given, it is set to (n-1...0), where n is the rank of the input tensor. Hence by default, this operation performs a regular matrix transpose on 2-D input Tensors.

So if you omit perm in higher dimensions, tf.transpose() reverses the dimensions just like it would for 2-D tensors (matrices):

# inp has shape (2, 4, 1), as defined in the snippet further down
inp_t_without_perm = tf.transpose(inp)
inp_t_without_perm
# Output: <tf.Tensor 'transpose_8:0' shape=(1, 4, 2) dtype=float32>

So it just swaps the last dimension with the first and leaves the middle one unaltered. This is equivalent to:

inp_t_with_wrong_perm = tf.transpose(inp, perm=[2,1,0])
inp_t_with_wrong_perm
# Output: <tf.Tensor 'transpose_8:0' shape=(1, 4, 2) dtype=float32>

if you then do:

mul = tf.matmul(inp, inp_t_without_perm) # or with inp_t_with_wrong_perm

you get this error, because either the batch dimensions or the last two (matrix) dimensions do not match up.

Now, when multiplying higher-order tensors together, you have to align the dimensions the same way you would in 2-D: think of a 3-D tensor as a batch of matrices (in your case, a batch of column vectors), where the leading dimension is the batch and the trailing two dimensions are the matrices being multiplied. (A formal treatment would use Einstein notation, but this is basically how it works; see the sketch below.)
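
As a rough illustration (the shapes here are invented for the example): tf.matmul treats a 3-D tensor as a batch of matrices, so the batch sizes must agree and the inner two dimensions must line up just like in the 2-D case:

a = tf.zeros((2, 4, 3))  # a batch of two 4x3 matrices
b = tf.zeros((2, 3, 5))  # a batch of two 3x5 matrices
tf.matmul(a, b).shape
# Output: TensorShape([2, 4, 5])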

For your case, what works is:

inp = tf.reshape(tf.linspace(-1.0, 1.0, 8), (2,4,1))
# switch the last two dimensions so you can multiply 4x1 by 1x4
# and leave first dimension as it is.
inp_t = tf.transpose(inp, perm=[0,2,1])
mul = tf.matmul(inp, inp_t)
mul
# Output: <tf.Tensor 'MatMul_8:0' shape=(2, 4, 4) dtype=float32>

Note that in your case this is the only permutation that works, since this kind of multiplication is non-commutative; you have to match up the dimensions from left to right. (A formal proof would require some higher-order tensor algebra, but this is precisely what you want to achieve.) I did not dig too deep into the documentation, but I think the transpose_b parameter does precisely this permutation for you. Hope that helps; please comment for further questions.
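
For what it's worth, a quick check (reusing inp from above) suggests the two indeed agree:

x = tf.matmul(inp, tf.transpose(inp, perm=[0, 2, 1]))
y = tf.matmul(inp, inp, transpose_b=True)
tf.reduce_all(tf.equal(x, y))  # expect: True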

Upvotes: 1

stephen_mugisha

Reputation: 897

tf.transpose() performs a regular 2-D matrix transpose by default: if you don't explicitly specify the perm (permutation) parameter, it sets perm to (n-1, ..., 0), where n is the rank of the input tensor, i.e. it reverses all dimensions. So set the perm parameter appropriately:

inp_t = tf.transpose(inp, perm=[0, 2, 1])
y = tf.matmul(inp, inp_t)
print(y)

Upvotes: 1

javidcf

Reputation: 59731

transpose_b=True in tf.linalg.matmul transposes only the last two axes of the second given tensor, while tf.transpose without further arguments reverses the dimensions completely. The equivalent would be:

inp_t = tf.transpose(inp, (0, 2, 1))
tf.matmul(inp, inp_t)
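
With the (2, 4, 1) input from the other answers, both calls then produce the same (2, 4, 4) result.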

Upvotes: 1
