Reputation: 19
I'm using Elasticsearch to find documents similar to a given document with the "more like this" query.
Is there an easy way to get the Elasticsearch score between 0 and 1 (using cosine similarity)?
Thanks!
Upvotes: 2
Views: 2930
Reputation: 249
Elasticsearch uses the Boolean model to find matching documents, and a formula called the practical scoring function to calculate relevance. This formula borrows concepts from term frequency/inverse document frequency and the vector space model but adds more modern features like a coordination factor, field length normalization, and term or query clause boosting.
Upvotes: 0
Reputation: 344
You may want to look into the Function Score functionality of Elasticsearch, more specifically the script_score and field_value_factor functions. These let you take the score from default scoring (_score) and enhance or replace it in other ways. It really depends on what sort of boosting or transformation you'd like. The default scoring model takes the vector space model into account, but other things as well.
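As a rough sketch of what that could look like, the Python helper below builds a request body that wraps a more_like_this query in function_score and replaces the score with _score / (_score + 1). That formula is only one illustrative way to squash a score into the range 0 to 1; it is not cosine similarity. The function names, the index name, and the field names are assumptions made up for the example.

```python
def build_bounded_mlt_query(index, doc_id, fields):
    """Build a search body whose final score is _score / (_score + 1).

    `index`, `doc_id` and `fields` are placeholders supplied by the caller;
    the saturation formula is an illustrative choice, not cosine similarity.
    """
    return {
        "query": {
            "function_score": {
                # The original "more like this" query produces the raw _score.
                "query": {
                    "more_like_this": {
                        "fields": fields,
                        "like": [{"_index": index, "_id": doc_id}],
                    }
                },
                # The script sees the raw _score and returns a bounded value.
                "script_score": {
                    "script": {"source": "_score / (_score + 1)"}
                },
                # "replace" discards the raw score and keeps the script result.
                "boost_mode": "replace",
            }
        }
    }


def saturate(score):
    """Same mapping as the script above, for checking values locally."""
    return score / (score + 1.0)


if __name__ == "__main__":
    # Hypothetical index and field names, for illustration only.
    body = build_bounded_mlt_query("articles", "1", ["title", "body"])
    print(body["query"]["function_score"]["script_score"])
    print(saturate(3.0))
```

The resulting dictionary would be sent as the body of a search request. A mapping like this keeps the ranking order unchanged and only rescales the values.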
Upvotes: 2
Reputation: 14077
I don't think that's possible to retrieve directly.
But perhaps this workaround would make sense?
Elasticsearch always returns max_score in the hits object of the response. You can divide each document's _score by max_score. The document with the highest score will then score 1, and documents that are less like the given one will score less.
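A minimal sketch of that division in Python, assuming the search response has already been parsed into a dictionary. The function name and the sample response are made up for illustration; only the hits, max_score, _id and _score keys come from the response format.

```python
def normalize_scores(response):
    """Return (doc_id, _score / max_score) pairs for each hit.

    Returns an empty list when max_score is missing, null or zero,
    since the division would be meaningless in that case.
    """
    hits = response["hits"]
    max_score = hits.get("max_score")
    if not max_score:
        return []
    return [(hit["_id"], hit["_score"] / max_score) for hit in hits["hits"]]


if __name__ == "__main__":
    # Hand-written sample shaped like a search response, not real output.
    sample = {
        "hits": {
            "max_score": 4.0,
            "hits": [
                {"_id": "a", "_score": 4.0},
                {"_id": "b", "_score": 2.0},
            ],
        }
    }
    print(normalize_scores(sample))  # [('a', 1.0), ('b', 0.5)]
```

Note that the result is relative to the current result set: a value of 1 marks the best hit for this query, not an exact match, so values from different queries are not comparable.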
Upvotes: 0