pr338

Reputation: 9170

What is the similarity score in the gensim similar_by_word function?


I was reading about the gensim similar_by_word function here: https://radimrehurek.com/gensim/models/keyedvectors.html

The similar_by_word function returns a sequence of (word, similarity) pairs. How is similarity defined here, and how is it calculated?

Upvotes: 0

Views: 644

Answers (1)

Anna Krogager

Reputation: 3588

The similarity measure used here is cosine similarity, which takes values between -1 and 1. It is the cosine of the angle between two vectors: if the angle is very small, the vectors point in nearly the same direction and are considered similar. A value of 1 means the vectors point in the same direction, 0 means they are orthogonal, and -1 means they point in opposite directions. This way of measuring similarity is common when working with high-dimensional vector spaces such as word embeddings.

The formula for the cosine similarity of two vectors A and B is as follows:

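$$\text{similarity}(A, B) = \cos(\theta) = \frac{A \cdot B}{\|A\| \, \|B\|} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2} \, \sqrt{\sum_{i=1}^{n} B_i^2}}$$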
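To make the calculation concrete, here is a minimal sketch of cosine similarity in plain Python. It is not gensim's own code, and the function name `cosine_similarity` and the three toy vectors are made up for illustration; real word embeddings have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    # Dot product A . B
    dot = sum(x * y for x, y in zip(a, b))
    # Euclidean norms ||A|| and ||B||
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical 3-dimensional "embeddings", invented for this example
king = [0.9, 0.1, 0.4]
queen = [0.8, 0.2, 0.5]
opposite = [-0.9, -0.1, -0.4]  # king scaled by -1

print(cosine_similarity(king, queen))     # close to 1: similar direction
print(cosine_similarity(king, opposite))  # close to -1: opposite direction
```

Because only the angle matters, scaling either vector by a positive constant leaves the score unchanged.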

Upvotes: 1
