LingxB

Reputation: 497

How to matmul a 2d tensor with a 3d tensor in tensorflow?

In NumPy you can multiply a 2D array with a 3D array, as in the example below:

>>> X = np.random.randn(3,5,4) # [3,5,4]
... W = np.random.randn(5,5) # [5,5]
... out = np.matmul(W, X) # [3,5,4]

From my understanding, np.matmul() takes W and broadcasts it along the first dimension of X. But in TensorFlow this is not allowed:

>>> _X = tf.constant(X)
... _W = tf.constant(W)
... _out = tf.matmul(_W, _X)

ValueError: Shape must be rank 2 but is rank 3 for 'MatMul_1' (op: 'MatMul') with input shapes: [5,5], [3,5,4].

So is there an equivalent in TensorFlow for what np.matmul() does above? And what's the best practice in TensorFlow for multiplying a 2D tensor with a 3D tensor?

Upvotes: 3

Views: 6599

Answers (5)

Souradeep Nanda

Reputation: 3288

Try using tf.tile to match the dimensions of the matrix before multiplication. NumPy's automatic broadcasting doesn't seem to be implemented in TensorFlow, so you have to do it manually.

W_T = tf.tile(tf.expand_dims(W,0),[3,1,1])

This should do the trick

import numpy as np
import tensorflow as tf

X = np.random.randn(3,5,4)
W = np.random.randn(5,5)

_X = tf.constant(X)
_W = tf.constant(W)
# Expand W to [1,5,5], then tile it along the batch dimension to [3,5,5]
_W_t = tf.tile(tf.expand_dims(_W, 0), [3, 1, 1])

with tf.Session() as sess:
    print(sess.run(tf.matmul(_W_t, _X)))  # shape [3,5,4], same as np.matmul(W, X)
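If the batch size isn't fixed, you can read it from the tensor instead of hard-coding 3. A minimal sketch, assuming the same TF 1.x setup as above:

import numpy as np
import tensorflow as tf

X = np.random.randn(3,5,4)
W = np.random.randn(5,5)

_X = tf.placeholder(tf.float64, shape=[None, 5, 4])  # batch size unknown at build time
_W = tf.constant(W)
# tf.shape(_X)[0] is the runtime batch size; tile W that many times
_W_t = tf.tile(tf.expand_dims(_W, 0), tf.stack([tf.shape(_X)[0], 1, 1]))

with tf.Session() as sess:
    out = sess.run(tf.matmul(_W_t, _X), feed_dict={_X: X})
    print(out.shape)  # (3, 5, 4)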

Upvotes: 5

CKM

Reputation: 1971

Here I'll use the Keras backend's K.dot together with tf.transpose. First swap the inner two dimensions of the 3D tensor:

X = tf.transpose(X, perm=[0,2,1]) # X shape=[3,4,5]

Now multiply. Note that W has to be transposed as well, since (W x)^T = x^T W^T:

out = K.dot(X, tf.transpose(W)) # out shape=[3,4,5]

and swap the axes back:

out = tf.transpose(out, perm=[0,2,1]) # out shape=[3,5,4]

This solution saves memory at a small cost in time, because W is never tiled.
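A self-contained version of the above, checked against NumPy. This is a sketch that assumes a TensorFlow build with bundled Keras (tf.keras), so the backend import path may differ in your setup:

import numpy as np
import tensorflow as tf
from tensorflow.keras import backend as K  # assumption: TF ships tf.keras

X = np.random.randn(3,5,4)
W = np.random.randn(5,5)

_X = tf.constant(X)
_W = tf.constant(W)

# (W @ x)^T = x^T @ W^T: transpose X, multiply by W^T, transpose back
_Xt = tf.transpose(_X, perm=[0,2,1])     # [3,4,5]
_out = K.dot(_Xt, tf.transpose(_W))      # [3,4,5]
_out = tf.transpose(_out, perm=[0,2,1])  # [3,5,4]

with tf.Session() as sess:
    print(np.allclose(sess.run(_out), np.matmul(W, X)))  # True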

Upvotes: 0

digitaldingo

Reputation: 519

You can also use tf.einsum to avoid tiling the tensor:

tf.einsum("ab,ibc->iac", _W, _X)

A full example:

import numpy as np
import tensorflow as tf

# Numpy-style matrix multiplication:
X = np.random.randn(3,5,4)
W = np.random.randn(5,5)
np_WX = np.matmul(W, X)

# TensorFlow-style multiplication:
_X = tf.constant(X)
_W = tf.constant(W)
_WX = tf.einsum("ab,ibc->iac", _W, _X)

with tf.Session() as sess:
    tf_WX = sess.run(_WX)

# Check that the results are the same:
print(np.allclose(np_WX, tf_WX))

Upvotes: 2

Vijay Mariappan

Reputation: 17191

You can use tensordot instead. Contracting W's second axis with X's second axis yields a [5,3,4] tensor, and the transpose reorders it to the expected [3,5,4]:

tf.transpose(tf.tensordot(_W, _X, axes=[[1],[1]]),[1,0,2])
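A runnable check of this approach (a sketch, assuming the same TF 1.x setup as the question):

import numpy as np
import tensorflow as tf

X = np.random.randn(3,5,4)
W = np.random.randn(5,5)

_X = tf.constant(X)
_W = tf.constant(W)
# Contract W's axis 1 with X's axis 1 (shape [5,3,4]), then reorder to [3,5,4]
_out = tf.transpose(tf.tensordot(_W, _X, axes=[[1],[1]]), [1,0,2])

with tf.Session() as sess:
    print(np.allclose(sess.run(_out), np.matmul(W, X)))  # True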

Upvotes: 5

coder3101

Reputation: 4165

The following is from the TensorFlow XLA broadcasting semantics documentation:

The XLA language is as strict and explicit as possible, avoiding implicit and "magical" features. Such features may make some computations slightly easier to define, at the cost of more assumptions baked into user code that will be difficult to change in the long term.

So TensorFlow doesn't offer an implicit broadcasting feature.

However, it does offer an operation that replicates a tensor as if it had been broadcast: tf.tile.

Its signature is as follows:

tf.tile(input, multiples, name=None)

This operation creates a new tensor by replicating input multiples times. The output tensor's i-th dimension has input.dims(i) * multiples[i] elements, and the values of input are replicated multiples[i] times along the i-th dimension.
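For example, here is a quick illustration of that replication rule (a sketch; the values are arbitrary):

import tensorflow as tf

a = tf.constant([[1, 2, 3],
                 [4, 5, 6]])  # shape [2,3]
# multiples=[2,1]: dim 0 becomes 2*2=4, dim 1 stays 3*1=3
b = tf.tile(a, [2, 1])        # shape [4,3]

with tf.Session() as sess:
    print(sess.run(b))
    # [[1 2 3]
    #  [4 5 6]
    #  [1 2 3]
    #  [4 5 6]]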

Upvotes: 3
