Reputation: 497
In numpy, you can multiply a 2D array with a 3D array, as in the example below:
>>> X = np.random.randn(3,5,4) # [3,5,4]
... W = np.random.randn(5,5) # [5,5]
... out = np.matmul(W, X) # [3,5,4]
From my understanding, np.matmul() takes W and broadcasts it along the first dimension of X.
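Each batch slice of the result should then equal an ordinary 2D product, which a quick check confirms:
>>> np.allclose(out[0], np.matmul(W, X[0]))
True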
But in tensorflow it is not allowed:
>>> _X = tf.constant(X)
... _W = tf.constant(W)
... _out = tf.matmul(_W, _X)
ValueError: Shape must be rank 2 but is rank 3 for 'MatMul_1' (op: 'MatMul') with input shapes: [5,5], [3,5,4].
So is there an equivalent in tensorflow for what np.matmul() does above? And what's the best practice in tensorflow for multiplying a 2D tensor with a 3D tensor?
Upvotes: 3
Views: 6599
Reputation: 3288
Try using tf.tile to match the dimensions of the matrices before multiplication. numpy's automatic broadcasting doesn't seem to be implemented in tensorflow, so you have to do it manually:
W_T = tf.tile(tf.expand_dims(W,0),[3,1,1])
This should do the trick:
import numpy as np
import tensorflow as tf

X = np.random.randn(3, 5, 4)
W = np.random.randn(5, 5)

_X = tf.constant(X)
_W = tf.constant(W)
# replicate W along a new leading batch axis: [5, 5] -> [3, 5, 5]
_W_t = tf.tile(tf.expand_dims(_W, 0), [3, 1, 1])

with tf.Session() as sess:
    print(sess.run(tf.matmul(_W_t, _X)))  # shape [3, 5, 4], same as np.matmul(W, X)
Upvotes: 5
Reputation: 1971
Here I'll use the Keras backend's K.dot and tensorflow's tf.transpose.
First swap the last two dimensions of the 3D tensor:
X = tf.transpose(X, perm=[0, 2, 1]) # X shape=[3,4,5]
Now multiply from the right; note that W is transposed so the result matches np.matmul(W, X):
out = K.dot(X, tf.transpose(W)) # out shape=[3,4,5]
and swap the axes back:
out = tf.transpose(out, perm=[0, 2, 1]) # out shape=[3,5,4]
This solution saves memory, at a small cost in time, because W is never tiled.
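A self-contained sketch of the above, checked against np.matmul (assuming TF 1.x sessions, as in the other answers, and the Keras backend shipped with tensorflow):
import numpy as np
import tensorflow as tf
from tensorflow.keras import backend as K

X = np.random.randn(3, 5, 4)
W = np.random.randn(5, 5)

_X = tf.constant(X)
_W = tf.constant(W)

_X_t = tf.transpose(_X, perm=[0, 2, 1])    # [3, 4, 5]
_out = K.dot(_X_t, tf.transpose(_W))       # [3, 4, 5]
_out = tf.transpose(_out, perm=[0, 2, 1])  # [3, 5, 4]

with tf.Session() as sess:
    print(np.allclose(sess.run(_out), np.matmul(W, X)))  # True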
Upvotes: 0
Reputation: 519
You can also use tf.einsum to avoid tiling the tensor:
tf.einsum("ab,ibc->iac", _W, _X)
Here "ab,ibc->iac" contracts W's second axis (b) with X's middle axis while carrying the batch index i through. A full example:
import numpy as np
import tensorflow as tf
# Numpy-style matrix multiplication:
X = np.random.randn(3,5,4)
W = np.random.randn(5,5)
np_WX = np.matmul(W, X)
# TensorFlow-style multiplication:
_X = tf.constant(X)
_W = tf.constant(W)
_WX = tf.einsum("ab,ibc->iac", _W, _X)
with tf.Session() as sess:
    tf_WX = sess.run(_WX)
# Check that the results are the same:
print(np.allclose(np_WX, tf_WX))
Upvotes: 2
Reputation: 17191
You can use tf.tensordot instead:
tf.transpose(tf.tensordot(_W, _X, axes=[[1],[1]]),[1,0,2])
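A full example of this approach (a sketch, verified against np.matmul):
import numpy as np
import tensorflow as tf

X = np.random.randn(3, 5, 4)
W = np.random.randn(5, 5)

_X = tf.constant(X)
_W = tf.constant(W)

# contract W's second axis with X's middle axis -> shape [5, 3, 4],
# then move the batch axis back to the front -> shape [3, 5, 4]
_out = tf.transpose(tf.tensordot(_W, _X, axes=[[1], [1]]), [1, 0, 2])

with tf.Session() as sess:
    print(np.allclose(sess.run(_out), np.matmul(W, X)))  # True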
Upvotes: 5
Reputation: 4165
The following is from the tensorflow XLA broadcasting semantics documentation:
The XLA language is as strict and explicit as possible, avoiding implicit and "magical" features. Such features may make some computations slightly easier to define, at the cost of more assumptions baked into user code that will be difficult to change in the long term.
So tensorflow doesn't offer a built-in broadcasting feature.
However, it does offer an operation that replicates a tensor as if it had been broadcast: tf.tile.
Its signature is as follows:
tf.tile(input, multiples, name=None)
This operation creates a new tensor by replicating input multiples times. The output tensor's i'th dimension has input.dims(i) * multiples[i] elements, and the values of input are replicated multiples[i] times along the 'i'th dimension.
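For instance, a minimal sketch using tf.tile to emulate the broadcasting from the question:
import numpy as np
import tensorflow as tf

X = np.random.randn(3, 5, 4)
W = np.random.randn(5, 5)

# [5, 5] -> [1, 5, 5] -> [3, 5, 5]: one copy of W per batch element of X
_W_tiled = tf.tile(tf.expand_dims(tf.constant(W), 0), [3, 1, 1])
_out = tf.matmul(_W_tiled, tf.constant(X))  # [3, 5, 4]

with tf.Session() as sess:
    print(np.allclose(sess.run(_out), np.matmul(W, X)))  # True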
Upvotes: 3