Pritam Deka

Reputation: 33

Is there a way of creating a cosine similarity matrix between sentence embeddings having different values?

I want to create a cosine similarity matrix of size 7x7 where each element of the matrix will be the cosine similarity of two arrays of size 1024.

[[ 0.1463873   0.6160218  -0.8804966  ...  1.520877    0.09114664
   0.14081596]]
[[ 0.54208326  0.7649026  -1.4366877  ...  1.6818116  -0.20427406
   0.3631045 ]]
[[ 0.32065052  0.67767006 -1.2465438  ...  0.6658634  -0.17746
   0.39568862]]
[[ 0.25573847  0.70055985 -1.1845624  ...  1.4804083  -0.34156996
   0.04723666]]
[[ 0.62882924  1.3213214  -1.4690932  ...  1.3146497  -0.1773764
  -0.4018889 ]]
[[ 0.82711285  1.1108592  -1.1221949  ...  1.4259428  -0.41509023
  -0.03925738]]
[[-0.04750526  0.42094198 -1.2134333  ...  0.7967724  -0.08025895
   0.32510945]]

Suppose these are the 7 arrays, each of size 1024. I want to compute the cosine similarity between every pair of them, so that the resulting matrix has size 7x7. Is there a way to do this?

Upvotes: 0

Views: 878

Answers (1)

Sahar Millis

Reputation: 907

I believe what you are looking for is the cosine_similarity function in scikit-learn (sklearn.metrics.pairwise.cosine_similarity).
Take a look at this.

Also, there is a good example of this on kite.

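Call sklearn.metrics.pairwise.cosine_similarity(array) to return an array containing the pairwise cosine similarities of the rows of array. The value in the i-th row and j-th column of the result is the cosine similarity between the i-th and j-th rows of array. For the seven embeddings in the question, each of which has shape (1, 1024), stack them into a single array of shape (7, 1024) first; the result is then the 7x7 matrix you want.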
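Here is a minimal sketch of how that could look for the arrays in the question. The embeddings below are random stand-ins (the real values are not available here), and the names embeddings, X, sim and manual are only illustrative. It assumes each embedding has shape (1, 1024), as the printed output in the question suggests.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Random placeholders for the 7 sentence embeddings (assumption: each has
# shape (1, 1024), matching the double brackets in the question's output).
rng = np.random.default_rng(0)
embeddings = [rng.standard_normal((1, 1024)) for _ in range(7)]

# Stack the seven (1, 1024) arrays into one (7, 1024) array, one row per sentence.
X = np.vstack(embeddings)

# Pairwise cosine similarity between all rows of X.
sim = cosine_similarity(X)
print(sim.shape)

# Optional cross-check with plain NumPy: normalise each row to unit length,
# then take dot products between all pairs of rows.
normed = X / np.linalg.norm(X, axis=1, keepdims=True)
manual = normed @ normed.T
print(np.allclose(sim, manual))
```

The diagonal of sim is 1 (each embedding compared with itself) and the matrix is symmetric, so sim[i, j] and sim[j, i] hold the same value.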

Upvotes: 0
