Reputation: 9170
What is the similarity score in the gensim similar_by_word function?
I was reading about the gensim similar_by_word function here: https://radimrehurek.com/gensim/models/keyedvectors.html
The similar_by_word function returns a sequence of (word, similarity) pairs. How is similarity defined here, and how is it calculated?
Upvotes: 0
Views: 644
Reputation: 3588
The similarity measure used here is the cosine similarity, which takes values between -1 and 1. The cosine similarity is the cosine of the angle between two vectors. If the angle is very small, the cosine is close to 1 and the vectors are considered similar, since they point in nearly the same direction; a value of 0 means the vectors are orthogonal, and -1 means they point in opposite directions. This way of measuring similarity is common when working with high-dimensional vector spaces such as word embeddings.
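The formula for the cosine similarity of two vectors A and B is as follows:

$$\text{similarity}(A, B) = \cos\theta = \frac{A \cdot B}{\|A\|\,\|B\|} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2}\,\sqrt{\sum_{i=1}^{n} B_i^2}}$$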
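To make the definition concrete, here is a minimal sketch in plain Python that computes the cosine similarity of two vectors by hand. The function name `cosine_similarity` and the toy vectors are my own, chosen for illustration; this is not gensim's internal code, only the same quantity computed directly.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    # Dot product A . B
    dot = sum(x * y for x, y in zip(a, b))
    # Euclidean norms ||A|| and ||B||
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors (hypothetical, not real word embeddings)
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])  # close to 1
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])                # close to 0
opposite = cosine_similarity([1.0, 2.0], [-1.0, -2.0])                # close to -1
```

Note that the result depends only on direction, not on vector length: scaling either vector by a positive constant leaves the score unchanged. With a trained gensim model, applying the same computation to the vectors of two words should agree with the score that `similar_by_word` reports for that pair, up to floating-point error.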
Upvotes: 1