bj1123

Reputation: 199

Obstacle with TensorFlow's tensordot for batch multiplication

I'm implementing an RBM in TensorFlow, and I've run into an obstacle implementing the parameter update with mini-batches.

There are 2 tensors:

The 1st tensor's shape is [100,3,1] and the 2nd tensor's shape is [100,1,4].

The number 100 is the batch size.

I want to multiply these tensors batch-wise, which should result in a [100,3,4] tensor.

But when I write code like

tf.tensordot(tensor1, tensor2, [[2], [1]])

the resulting tensor's shape is [100,3,100,4].

How do I solve this problem?
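
For reference, a minimal reproduction (the NumPy inputs via tf.constant are just for illustration):

import numpy as np
import tensorflow as tf

tensor1 = tf.constant(np.random.rand(100, 3, 1))  # [batch, 3, 1]
tensor2 = tf.constant(np.random.rand(100, 1, 4))  # [batch, 1, 4]
out = tf.tensordot(tensor1, tensor2, [[2], [1]])
# tensordot takes an outer product over all non-contracted axes,
# so the batch axes are not aligned and appear twice:
print(out.shape)  # (100, 3, 100, 4)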

Upvotes: 4

Views: 1752

Answers (2)

The AI Architect

Reputation: 1917

You can use tf.keras.backend.batch_dot instead; it treats the first dimension of both tensors as the batch dimension, and should do what you want.
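
A minimal sketch for the shapes in the question (the tensor names are placeholders, and this assumes a TensorFlow build that ships tf.keras):

import numpy as np
import tensorflow as tf

tensor1 = tf.constant(np.random.rand(100, 3, 1))
tensor2 = tf.constant(np.random.rand(100, 1, 4))
# contract axis 2 of tensor1 with axis 1 of tensor2; axis 0 stays the batch axis
out = tf.keras.backend.batch_dot(tensor1, tensor2, axes=(2, 1))
print(out.shape)  # (100, 3, 4)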

Upvotes: 1

ptsw

Reputation: 195

I'm not sure if you're still facing this issue (it's been a month), but I resolved the same problem by combining tf.tensordot with tf.map_fn, which accepts nested input elements and maps a function across the first (usually batch) dimension. The following function performs a batch-parallel matrix multiplication across the final two dimensions of tensors of arbitrary rank (as long as the last axis of the first tensor matches the second-to-last axis of the second):

import tensorflow as tf

def matmul_final_two_dims(tensor1, tensor2):
  # set this to the appropriate dtype, as map_fn seems to have
  # some dtype-inference difficulties:
  _your_dtype_here = tf.float64
  # for each batch element, contract the last axis of tensor1
  # with the second-to-last axis of tensor2:
  return tf.map_fn(lambda xy: tf.tensordot(xy[0], xy[1], axes=[[-1], [-2]]),
                   elems=(tensor1, tensor2), dtype=_your_dtype_here)

Example usage:

>>> batchsize = 3
>>> tensor1 = np.random.rand(batchsize, 3, 4, 5, 2)  # final dims [5,2]
>>> tensor2 = np.random.rand(batchsize, 2, 3, 2, 4)  # final dims [2,4]
>>> sess.run(tf.shape(matmul_final_two_dims(tensor1, tensor2)))
array([3, 3, 4, 5, 2, 3, 4], dtype=int32)
>>> matmul_final_two_dims(tensor1, tensor2)
<tf.Tensor 'map_1/TensorArrayStack/TensorArrayGatherV3:0' shape=(3, 3, 4, 5, 2, 3, 4) dtype=float64>

Note in particular that the first dimension of the output is the correct batch size and that the final 2 in the input shapes has been contracted out. You will have to apply a tf.transpose to get the dimension-5 index into the right place, though, since the axes of the output are ordered as they appear in the input tensors.
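
For example (the target ordering below is purely illustrative; pick whatever perm your layout needs):

out = matmul_final_two_dims(tensor1, tensor2)  # shape (3, 3, 4, 5, 2, 3, 4)
# move axis 3 (the dimension-5 index) to the end:
out_t = tf.transpose(out, perm=[0, 1, 2, 4, 5, 6, 3])  # shape (3, 3, 4, 2, 3, 4, 5)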

I'm using TF v1.1. tf.map_fn can be parallelized, but I'm not sure the above is the most efficient implementation. For reference:

tf.tensordot API

tf.map_fn API

EDIT: the above was what worked for me, but I think you can also use an einsum (docs here) to accomplish what you want:

>>> tensor1 = tf.constant(np.random.rand(3, 4, 5))
>>> tensor2 = tf.constant(np.random.rand(3, 5, 7))
>>> tf.einsum('bij,bjk->bik', tensor1, tensor2)
<tf.Tensor 'transpose_2:0' shape=(3, 4, 7) dtype=float64>
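
Applied to the shapes in the question (the tensor names are placeholders), the same pattern gives the desired [100,3,4] result directly:

tensor1 = tf.constant(np.random.rand(100, 3, 1))
tensor2 = tf.constant(np.random.rand(100, 1, 4))
out = tf.einsum('bij,bjk->bik', tensor1, tensor2)  # shape (100, 3, 4)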

Upvotes: 7
