Abrar

Reputation: 631

tf.multiply vs tf.matmul to calculate the dot product

I have a matrix (of vectors) X with shape [3,4], and I want to calculate the dot product between each pair of row vectors: (X[1].X[1]), (X[1].X[2]), and so on.

I saw some cosine similarity code where they use

tf.reduce_sum(tf.multiply(X, X), axis=1)

to calculate the dot product between the vectors in a matrix of vectors. However, this only calculates the dot product between (X[i], X[i]).

I used tf.matmul(X, X, transpose_b=True), which calculates the dot product between every pair of vectors, but I am still confused about why tf.multiply didn't do this. I think the problem is with my code.

The code is:

import tensorflow as tf

data = [[1.0, 2.0, 4.0, 5.0], [0.0, 6.0, 7.0, 8.0], [8.0, 1.0, 1.0, 1.0]]
X = tf.constant(data)
matResult = tf.matmul(X, X, transpose_b=True)

multiplyResult = tf.reduce_sum(tf.multiply(X, X), axis=1)
with tf.Session() as sess:
    print('matResult')
    print(sess.run([matResult]))
    print()
    print('multiplyResult')
    print(sess.run([multiplyResult]))

The output is:

matResult
[array([[  46.,   80.,   19.],
       [  80.,  149.,   21.],
       [  19.,   21.,   67.]], dtype=float32)]

multiplyResult
 [array([  46.,  149.,   67.], dtype=float32)]

I would appreciate any advice.

Upvotes: 24

Views: 47610

Answers (4)

Prasanth S

Reputation: 146

For what it is worth, XdotX = tf.matmul(X, X, transpose_b=True) is functionally equivalent to

X_left = tf.expand_dims(X, axis=-2)
X_right = tf.expand_dims(X, axis=-3)
XdotX = tf.reduce_sum(tf.multiply(X_left, X_right), axis=-1, keepdims=False)

If X is (M,N) dimensional:

  • X_left will be (M,1,N) dimensional.
    X_left[i][0][k] will equal X[i][k].
  • X_right will be (1,M,N) dimensional.
    X_right[0][j][k] will equal X[j][k].
  • tf.multiply(X_left, X_right) will be (M,M,N) dimensional.
    tf.multiply(X_left, X_right)[i][j][k] will equal X[i][k]*X[j][k].

Summing over the k index with tf.reduce_sum with axis=-1 will produce the desired result.


The dot product between different matrices X and Y, tf.matmul(X, Y, transpose_b=True), can be accomplished the same way, by using tf.expand_dims with axis=-3 on Y instead of X.
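
A minimal sketch of that X/Y variant (the tensors and the name XdotY here are made up for illustration):

import tensorflow as tf

X = tf.constant([[1.0, 2.0], [3.0, 4.0]])              # shape (M, N) = (2, 2)
Y = tf.constant([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # shape (P, N) = (3, 2)

X_left = tf.expand_dims(X, axis=-2)   # shape (M, 1, N)
Y_right = tf.expand_dims(Y, axis=-3)  # shape (1, P, N)

# Broadcasting makes the product (M, P, N); summing out the last axis
# leaves the (M, P) matrix of pairwise dot products X[i].Y[j].
XdotY = tf.reduce_sum(tf.multiply(X_left, Y_right), axis=-1)
# XdotY should equal tf.matmul(X, Y, transpose_b=True)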

Upvotes: 1

patapouf_ai

Reputation: 18723

tf.multiply(X, Y) or the * operator does element-wise multiplication so that:

[[1 2]    [[1 3]      [[1 6]
 [3 4]] .  [2 1]]  =   [6 4]]

whereas tf.matmul does matrix multiplication so that:

[[1 0]    [[1 3]      [[1 3]
 [0 1]] .  [2 1]]  =   [2 1]]

using tf.matmul(X, X, transpose_b=True) means that you are calculating X . X^T, where ^T indicates the transpose of the matrix and . is matrix multiplication.

tf.reduce_sum(_, axis=1) takes the sum along the 1st axis (counting from 0), which means you are summing the rows:

tf.reduce_sum([[a, b], [c, d]], axis=1) = [a+b, c+d]

This means that:

tf.reduce_sum(tf.multiply(X, X), axis=1) = [X[1].X[1], ..., X[n].X[n]]

so that is the one you want if you only want the squared norm of each row. On the other hand:

tf.matmul(X, X, transpose_b=True) = [
                                      [ X[1].X[1], X[1].X[2], ..., X[1].X[n] ], 
                                      [ X[2].X[1], ..., X[2].X[n] ],
                                       ...
                                      [ X[n].X[1], ..., X[n].X[n] ]
                                   ]

so that is what you need if you want the similarity between all pairs of rows.
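
As a quick sanity check (written in the TF1 session style the question uses), the diagonal of the matmul result should match the reduce_sum result:

import tensorflow as tf

X = tf.constant([[1.0, 2.0, 4.0, 5.0],
                 [0.0, 6.0, 7.0, 8.0],
                 [8.0, 1.0, 1.0, 1.0]])

pairwise = tf.matmul(X, X, transpose_b=True)          # [i, j] = X[i].X[j]
self_dots = tf.reduce_sum(tf.multiply(X, X), axis=1)  # [i] = X[i].X[i]

with tf.Session() as sess:
    p, s = sess.run([pairwise, self_dots])
    print(p.diagonal())  # [ 46. 149.  67.]
    print(s)             # [ 46. 149.  67.] -- the same values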

Upvotes: 50

Ben Usman

Reputation: 8387

What tf.multiply(X, X) does is essentially multiply each element of the matrix by itself, so that

[[1 2]
 [3 4]]

would turn into

[[1 4]
 [9 16]]

whereas tf.reduce_sum(_, axis=1) takes a sum of each row, so the result for the previous example will be

[5 25]

which is exactly (by definition) equal to [X[0, :] @ X[0, :], X[1, :] @ X[1, :]].

Just write it down with variable names, [[a b] [c d]] instead of actual numbers, and look at what tf.matmul(X, X) and tf.multiply(X, X) do.
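
Doing exactly that with sympy (not TensorFlow; just a convenient way to see the symbolic result) gives:

import sympy as sp

a, b, c, d = sp.symbols('a b c d')
X = sp.Matrix([[a, b], [c, d]])

# Element-wise product, as tf.multiply(X, X) computes:
print(X.multiply_elementwise(X))  # Matrix([[a**2, b**2], [c**2, d**2]])

# Matrix product, as tf.matmul(X, X) computes:
print(X * X)  # Matrix([[a**2 + b*c, a*b + b*d], [a*c + c*d, b*c + d**2]])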

Upvotes: 5

Seeni

Reputation: 1616

In short, tf.multiply() does an element-wise product (Hadamard product), whereas tf.matmul() does actual matrix multiplication. So tf.multiply() needs arguments of the same shape, so that the element-wise product is possible, i.e. shapes (n,m) and (n,m). But tf.matmul() needs arguments of shape (n,m) and (m,p), so that the resulting matrix is (n,p) [the usual math].

Once understood, this can be applied to multi-dimensional tensors easily.
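
For example (shapes chosen arbitrarily for illustration), tf.matmul treats the leading dimension of 3-D tensors as a batch, while tf.multiply still just needs matching (broadcast-compatible) shapes:

import tensorflow as tf

A = tf.ones([2, 3, 4])  # a batch of two (3, 4) matrices
B = tf.ones([2, 4, 5])  # a batch of two (4, 5) matrices

batched = tf.matmul(A, B)     # shape (2, 3, 5): one matmul per batch entry
elemwise = tf.multiply(A, A)  # shape (2, 3, 4): element-wise, same shapes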

Upvotes: 4
