Kuba

Reputation: 3056

Difference in matrix multiplication tensorflow vs numpy

I have a case where matrix multiplication of two matrices with certain dimensions works in numpy, but doesn't work in tensorflow.

import numpy as np

x = np.ndarray(shape=(10, 20, 30), dtype=float)
y = np.ndarray(shape=(30, 40), dtype=float)
z = np.matmul(x, y)
print("np shapes: %s x %s = %s" % (np.shape(x), np.shape(y), np.shape(z)))

This works as expected and prints:

np shapes: (10, 20, 30) x (30, 40) = (10, 20, 40)

However, in tensorflow, when I try to multiply a placeholder and a variable with the same shapes as the numpy arrays above, I get an error:

import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder)

x = tf.placeholder(tf.float32, shape=(10, 20, 30))
y = tf.Variable(tf.truncated_normal([30, 40], name='w'))
print("tf shapes: %s x %s" % (x.get_shape(), y.get_shape()))
tf.matmul(x, y)  # this call raises the error

This results in:

tf shapes: (10, 20, 30) x (30, 40)
InvalidArgumentError: 
Shape must be rank 2 but is rank 3 for 'MatMul_12' 
(op: 'MatMul') with input shapes: [10,20,30], [30,40].

Why does this operation fail?

Upvotes: 7

Views: 3903

Answers (3)

Dmytro Danevskyi

Reputation: 3159

I don't know why tf.matmul does not support this kind of multiplication (maybe one of the core developers could provide a meaningful answer).

But if you just want to multiply tensors in this way, take a look at the tf.einsum function. It can operate on tensors of arbitrary rank.
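To make the einsum subscripts less opaque, here is a minimal pure-Python sketch of what the string 'ijk,kl->ijl' asks for: the shared index k is summed over, and i, j, l survive into the output. The helper name einsum_ijk_kl and the tiny input lists are made up for illustration; this is not how tf.einsum is implemented.

```python
# Pure-Python sketch of what the subscripts 'ijk,kl->ijl' compute.
# einsum_ijk_kl is a hypothetical helper, written only for illustration.
def einsum_ijk_kl(x, y):
    """Return z with z[i][j][l] = sum over k of x[i][j][k] * y[k][l]."""
    n_k = len(y)        # length of the summed (inner) axis
    n_l = len(y[0])     # length of the last output axis
    return [
        [
            [sum(row[k] * y[k][l] for k in range(n_k)) for l in range(n_l)]
            for row in matrix          # j runs over the rows of each matrix
        ]
        for matrix in x                # i runs over the batch of matrices
    ]

x = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]   # shape (2, 2, 2)
y = [[1, 0, 1], [0, 1, 1]]                 # shape (2, 3)
z = einsum_ijk_kl(x, y)                    # shape (2, 2, 3)
```

In other words, every (j, k) matrix inside x is multiplied by the same y, which is the behaviour the question expects from matmul.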

Upvotes: 2

Salvador Dali

Reputation: 222441

As others have already said, you can use tf.einsum() to get the result you want.

import tensorflow as tf
x = tf.random_normal([10, 20, 30])
y = tf.random_normal([30, 40])
z = tf.einsum('ijk,kl->ijl', x, y)

The reason tf.matmul() does not work the way you expected is given in the documentation:

The inputs must be matrices (or tensors of rank > 2, representing batches of matrices), with matching inner dimensions, possibly after transposition.

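In your case you have a matrix y (rank 2) and a tensor x (rank 3 > 2). The inner dimensions (30 and 30) actually agree, but the ranks do not: judging by the error message, the tf.matmul that raised it falls back to the plain rank-2 MatMul op when one argument is a matrix, and does not broadcast y across the batch in x. If you want tf.matmul to accept the inputs, both need to be batches of matrices, something like this: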

import tensorflow as tf
a, b, c = 12, 50, 20
x = tf.random_normal([a, b, c])
y = tf.random_normal([a, c, b])
z = tf.matmul(x, y)

But clearly this does not compute what you want.
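If you would rather stay with a plain matrix product, one common workaround is to flatten the leading axes of x into a single axis, do an ordinary rank-2 multiplication, and reshape back. The sketch below shows the idea in NumPy so it can be checked against np.matmul; the assumption is that the same steps carry over to TensorFlow with tf.reshape and tf.matmul, which I have not run here.

```python
import numpy as np

# Random data with the shapes from the question.
rng = np.random.default_rng(0)
x = rng.standard_normal((10, 20, 30))
y = rng.standard_normal((30, 40))

# Collapse the two leading axes so that x becomes an ordinary (200, 30) matrix.
x2d = x.reshape(-1, x.shape[-1])
# Plain rank-2 matrix product: (200, 30) x (30, 40) -> (200, 40).
z2d = x2d @ y
# Restore the leading axes: (200, 40) -> (10, 20, 40).
z = z2d.reshape(x.shape[0], x.shape[1], y.shape[1])
```

This works because y is the same for every matrix in the batch, so stacking all the rows of x into one tall matrix does not change any individual dot product.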

Upvotes: 0

Kuba

Reputation: 3056

As suggested by Dmytro, tf.einsum can be used to multiply these two arrays.

x = np.ndarray(shape=(10,20,30), dtype = float)
y = np.ndarray(shape=(30,40), dtype = float)

These two operations produce exactly the same result:

np.einsum('ijk,kl->ijl', x, y)
np.matmul(x,y)

And the corresponding tensorflow operation also works (here tf_x and tf_y stand for tensors with the same shapes as x and y):

tf.einsum('ijk,kl->ijl', tf_x,tf_y)
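A quick way to check the equivalence numerically is sketched below. It uses random values rather than np.ndarray(...), since np.ndarray returns uninitialized memory that may contain NaNs, and NaN never compares equal to itself. The comparison uses np.allclose instead of exact equality, on the assumption that the two routines may differ in floating-point rounding.

```python
import numpy as np

# Random, fully initialized inputs with the shapes from the question.
rng = np.random.default_rng(42)
x = rng.standard_normal((10, 20, 30))
y = rng.standard_normal((30, 40))

z_einsum = np.einsum('ijk,kl->ijl', x, y)
z_matmul = np.matmul(x, y)

# Compare shapes exactly and values up to floating-point tolerance.
same_shape = z_einsum.shape == z_matmul.shape
same_values = np.allclose(z_einsum, z_matmul)
print(same_shape, same_values)
```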

Upvotes: 0
