Reputation: 631
I have a matrix (of vectors) X with shape [3,4], and I want to calculate the dot product between each pair of vectors: (X[1].X[1]), (X[1].X[2]), etc.
I saw a cosine similarity code where they use
tf.reduce_sum(tf.multiply(X, X), axis=1)
to calculate the dot product between the vectors in a matrix of vectors. However, this only calculates the dot product between (X[i], X[i]).
I used tf.matmul(X, X, transpose_b=True), which calculates the dot product between every two vectors, but I am still confused about why tf.multiply didn't do this. I think the problem is with my code.
The code is:
import tensorflow as tf

data = [[1.0, 2.0, 4.0, 5.0], [0.0, 6.0, 7.0, 8.0], [8.0, 1.0, 1.0, 1.0]]
X = tf.constant(data)
matResult = tf.matmul(X, X, transpose_b=True)              # every pairwise dot product
multiplyResult = tf.reduce_sum(tf.multiply(X, X), axis=1)  # only X[i].X[i]
with tf.Session() as sess:
    print('matResult')
    print(sess.run([matResult]))
    print()
    print('multiplyResult')
    print(sess.run([multiplyResult]))
The output is:
matResult
[array([[ 46., 80., 19.],
[ 80., 149., 21.],
[ 19., 21., 67.]], dtype=float32)]
multiplyResult
[array([ 46., 149., 67.], dtype=float32)]
I would appreciate any advice.
Upvotes: 24
Views: 47610
Reputation: 146
For what it is worth, XdotX = tf.matmul(X, X, transpose_b=True) is functionally equivalent to:

X_left = tf.expand_dims(X, axis=-2)
X_right = tf.expand_dims(X, axis=-3)
XdotX = tf.reduce_sum(tf.multiply(X_left, X_right), axis=-1, keepdims=False)
If X is (M,N) dimensional:
- X_left will be (M,1,N) dimensional. X_left[i][0][k] will equal X[i][k].
- X_right will be (1,M,N) dimensional. X_right[0][j][k] will equal X[j][k].
- tf.multiply(X_left, X_right) will be (M,M,N) dimensional. tf.multiply(X_left, X_right)[i][j][k] will equal X[i][k]*X[j][k].
- Summing over the k index with tf.reduce_sum with axis=-1 produces the desired result.
The dot product between different matrices X and Y, i.e. tf.matmul(X, Y, transpose_b=True), can be accomplished the same way, by applying tf.expand_dims with axis=-3 to Y instead of X, as in the sketch below.
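A minimal sketch of that X-versus-Y variant (the names Y and XdotY are illustrative, not from the answer above; this runs eagerly under TF 2, or inside a Session under TF 1):

import tensorflow as tf

X = tf.constant([[1.0, 2.0], [3.0, 4.0]])              # shape (M, N) = (2, 2)
Y = tf.constant([[5.0, 6.0], [7.0, 8.0], [9.0, 0.0]])  # shape (P, N) = (3, 2)

X_left = tf.expand_dims(X, axis=-2)   # shape (M, 1, N)
Y_right = tf.expand_dims(Y, axis=-3)  # shape (1, P, N)

# Broadcasting the element-wise product gives shape (M, P, N); summing out
# the last axis yields the (M, P) matrix of all pairwise dot products,
# matching tf.matmul(X, Y, transpose_b=True).
XdotY = tf.reduce_sum(tf.multiply(X_left, Y_right), axis=-1)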
Upvotes: 1
Reputation: 18723
tf.multiply(X, Y) or the * operator does element-wise multiplication, so that:
[[1 2] [[1 3] [[1 6]
[3 4]] . [2 1]] = [6 4]]
whereas tf.matmul does matrix multiplication, so that:
[[1 0] [[1 3] [[1 3]
[0 1]] . [2 1]] = [2 1]]
Using tf.matmul(X, X, transpose_b=True) means that you are calculating X . X^T, where ^T indicates the transpose of the matrix and . is matrix multiplication.
tf.reduce_sum(_, axis=1) takes the sum along the 1st axis (counting from 0), which means you are summing the rows:
tf.reduce_sum([[a, b], [c, d]], axis=1) = [a+b, c+d]
This means that:
tf.reduce_sum(tf.multiply(X, X), axis=1) = [X[1].X[1], ..., X[n].X[n]]
so that is the one you want if you only want the squared norm of each row. On the other hand:
tf.matmul(X, X, transpose_b=True) = [
[ X[1].X[1], X[1].X[2], ..., X[1].X[n] ],
[ X[2].X[1], ..., X[2].X[n] ],
...
[ X[n].X[1], ..., X[n].X[n] ]
]
so that is what you need if you want the similarity between all pairs of rows.
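For concreteness, a small sketch reproducing both results on the asker's data (written in TF 2 eager style, which is an assumption; under TF 1, run the tensors in a Session as in the question):

import tensorflow as tf

X = tf.constant([[1.0, 2.0, 4.0, 5.0],
                 [0.0, 6.0, 7.0, 8.0],
                 [8.0, 1.0, 1.0, 1.0]])

# All pairwise dot products X[i].X[j]: the (3, 3) Gram matrix.
pairwise = tf.matmul(X, X, transpose_b=True)

# Only the diagonal terms X[i].X[i]: the squared norm of each row, shape (3,).
squared_norms = tf.reduce_sum(tf.multiply(X, X), axis=1)

print(pairwise)       # [[46, 80, 19], [80, 149, 21], [19, 21, 67]]
print(squared_norms)  # [46, 149, 67]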
Upvotes: 50
Reputation: 8387
What tf.multiply(X, X) does is essentially multiply each element of the matrix by itself, like
[[1 2]
[3 4]]
would turn into
[[1 4]
[9 16]]
whereas tf.reduce_sum(_, axis=1) takes the sum of each row, so the result for the previous example will be
[5 25]
which is exactly (by definition) equal to [X[0, :] @ X[0, :], X[1, :] @ X[1, :]].
Just write it down with variable names [[a b] [c d]] instead of actual numbers and look at what tf.matmul(X, X) and tf.multiply(X, X) each do, as in the sketch below.
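Working that through symbolically (here with sympy, an illustrative choice not used in the answer above) makes the difference explicit:

import sympy as sp

a, b, c, d = sp.symbols('a b c d')
X = sp.Matrix([[a, b], [c, d]])

# Element-wise product, which is what tf.multiply(X, X) computes:
print(X.multiply_elementwise(X))  # Matrix([[a**2, b**2], [c**2, d**2]])

# Matrix product, which is what tf.matmul(X, X) computes:
print(X * X)  # Matrix([[a**2 + b*c, a*b + b*d], [a*c + c*d, b*c + d**2]])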
Upvotes: 5
Reputation: 1616
In short, tf.multiply() does an element-wise (Hadamard) product, whereas tf.matmul() does actual matrix multiplication. So tf.multiply() needs arguments of the same shape, so that an element-wise product is possible, i.e. shapes (n,m) and (n,m). But tf.matmul() needs arguments of shape (n,m) and (m,p), so that the resulting matrix is (n,p) [usual math].
Once understood, this can be applied easily to multi-dimensional matrices.
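A quick sketch of those shape rules (the tensors here are illustrative placeholders):

import tensorflow as tf

A = tf.ones((2, 3))  # shape (n, m)
B = tf.ones((2, 3))  # same shape (n, m), as tf.multiply requires
C = tf.ones((3, 4))  # shape (m, p), as tf.matmul requires

elementwise = tf.multiply(A, B)  # shape (2, 3): element-wise product
matrix_prod = tf.matmul(A, C)    # shape (2, 4): the usual (n, p) result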
Upvotes: 4