gust

Reputation: 945

Compute all cosine similarities in a matrix

Say I have a matrix mat, a 100 x 200 array.

My question is twofold:

  1. How can I compute the cosine similarity of the first row against all the other rows? I tried using sklearn's cosine_similarity function but passing in a 100 x 200 matrix gives me a 100 x 100 array (instead of a 100 x 1 array).

  2. If I wanted to compute the cosine similarities of all the rows against the others, say compute all 100 C 2 = 4950 different combinations of all the rows, would it be fastest not to use something like sklearn but actually store the norms of each of the rows by np.linalg.norm and then compute each similarity by cos_sim = dot(a, b)/(norm(a)*norm(b))?

Upvotes: 0

Views: 921

Answers (1)

Majd Al-okeh

Reputation: 136

1. For the first row against all the rows, try:

cosines = (numpy.inner(mat[0], mat) / (numpy.linalg.norm(mat[0]) * numpy.linalg.norm(mat, axis=1)))
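As a minimal, self-contained sketch of the line above: the random data and the seed are assumptions standing in for the asker's mat, and row_norms is just an illustrative name. It also assumes no row is all zeros, otherwise the division produces nan.

```python
import numpy as np

# Stand-in data with the shape from the question (assumed, not the asker's data).
rng = np.random.default_rng(0)
mat = rng.standard_normal((100, 200))

# Norm of every row, computed once: shape (100,).
row_norms = np.linalg.norm(mat, axis=1)

# Dot product of row 0 with every row, divided by the product of the norms.
# mat @ mat[0] is equivalent to numpy.inner(mat[0], mat) here.
cosines = (mat @ mat[0]) / (row_norms[0] * row_norms)

print(cosines.shape)  # one similarity per row, including row 0 against itself
```

The result is a flat array of length 100 whose first entry is the similarity of row 0 with itself. If you would rather stay with sklearn, cosine_similarity accepts a second argument, so cosine_similarity(mat[:1], mat) should give the same values as a 1 x 100 array instead of the full 100 x 100 matrix.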

2. You can adapt the previous code to do the same thing for all pairs, knowing that

numpy.linalg.norm(mat, axis=1)

computes the norms of all the row vectors at once; at each step you then multiply them by the norm of the current row. Also,

numpy.inner(mat, mat)

will give you the symmetric 100 x 100 matrix of inner products between every pair of rows.
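Putting those two pieces together, here is a rough sketch of the all-pairs case. The random data, the seed, and the names gram, cos_sim and pairwise are assumptions for illustration, and it again assumes no row is all zeros.

```python
import numpy as np

# Stand-in data with the shape from the question (assumed, not the asker's data).
rng = np.random.default_rng(0)
mat = rng.standard_normal((100, 200))

# Row norms, computed once and reused for every pair.
row_norms = np.linalg.norm(mat, axis=1)

# All pairwise inner products; same result as numpy.inner(mat, mat).
gram = mat @ mat.T

# Divide entry (i, j) by norm(row i) * norm(row j).
cos_sim = gram / np.outer(row_norms, row_norms)

# Keep only the strict upper triangle: the 100 C 2 distinct pairs.
i, j = np.triu_indices(len(mat), k=1)
pairwise = cos_sim[i, j]

print(pairwise.shape)
```

For a matrix this small, a single matrix product like this is likely to be faster than looping over the 4950 pairs in Python, though it is worth timing on your own data. It computes each pair twice (the matrix is symmetric), which is usually a fair trade for avoiding the loop.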

Upvotes: 1
