Reputation: 4774
I want to add another scoring factor to Lucene's similarity equation. The problem is that I can't just override the Similarity class, because it is unaware of the document and terms it is computing scores for.
For example, in a document with the text below:
The cat is in the top of the tree, and he is going to stay there.
I have an algorithm of my own that assigns each term in this document a score reflecting how important that term is to the document as a whole. A possible set of scores is:
cat: 0.789212
tree: 0.633423
top: 0.412315
stay: 0.123912
there: 0.0999842
going: 0.00988412
...
The score for each word differs from document to document. For example, in another document, cat could have the score 0.0023912.
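To make the data concrete, here is a minimal sketch of how these per-document weights could be held in memory. The class name `TermWeights`, the document ids `doc1` and `doc2`, and the fallback of 0.0 for unweighted terms are all assumptions made for illustration; none of this is part of Lucene.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical holder for per-document term weights; not a Lucene class. */
class TermWeights {
    // docId -> (term -> weight)
    private final Map<String, Map<String, Double>> weights = new HashMap<>();

    /** Records the weight of a term within one document. */
    void put(String docId, String term, double weight) {
        weights.computeIfAbsent(docId, k -> new HashMap<>()).put(term, weight);
    }

    /** Looks up a weight; terms with no assigned weight fall back to 0.0 (an assumption). */
    double get(String docId, String term) {
        return weights
                .getOrDefault(docId, Collections.<String, Double>emptyMap())
                .getOrDefault(term, 0.0);
    }

    public static void main(String[] args) {
        TermWeights tw = new TermWeights();
        tw.put("doc1", "cat", 0.789212);   // weight from the example above
        tw.put("doc2", "cat", 0.0023912);  // same term, different document
        System.out.println(tw.get("doc1", "cat"));
        System.out.println(tw.get("doc2", "cat"));
        System.out.println(tw.get("doc1", "dog")); // never weighted
    }
}
```

The point is that the weight is keyed by both document and term, which is exactly the information a plain Similarity override does not see.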
I want to add this score to Lucene's scoring, but I'm kind of lost on how to do that.
Any tips?
Upvotes: 2
Views: 1361
Reputation: 12853
Use Lucene's Payload feature. The steps, from http://www.lucidimagination.com/blog/2009/08/05/getting-started-with-payloads/:
- Add a Payload to one or more Tokens during indexing.
- Override the Similarity class to handle scoring payloads.
- Use a payload-aware Query during your search.
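The idea behind those steps can be sketched in plain Java, with no Lucene classes involved: a payload is just a small byte array stored next to a term occurrence, so a per-term weight can be packed into it at index time and unpacked at search time. The class name `PayloadSketch`, the 4-byte big-endian layout, and the rule of multiplying the base score by the weight are assumptions made for illustration. The actual payload helpers, Similarity hooks, and payload-aware query classes differ between Lucene versions, so check the release you are using.

```java
/** Plain-Java sketch of the payload idea; it does not use or mimic a specific Lucene API. */
class PayloadSketch {

    /** Indexing side: pack a per-term weight into 4 bytes (big-endian IEEE 754 bits). */
    static byte[] encodeWeight(float weight) {
        int bits = Float.floatToIntBits(weight);
        return new byte[] {
            (byte) (bits >>> 24),
            (byte) (bits >>> 16),
            (byte) (bits >>> 8),
            (byte) bits
        };
    }

    /** Search side: unpack the weight from the stored bytes. */
    static float decodeWeight(byte[] payload) {
        int bits = ((payload[0] & 0xFF) << 24)
                 | ((payload[1] & 0xFF) << 16)
                 | ((payload[2] & 0xFF) << 8)
                 |  (payload[3] & 0xFF);
        return Float.intBitsToFloat(bits);
    }

    /** Hypothetical combination rule: scale the base score by the decoded weight. */
    static float score(float baseScore, byte[] payload) {
        return baseScore * decodeWeight(payload);
    }

    public static void main(String[] args) {
        // "cat" carries a different payload in each of two documents.
        byte[] catInDoc1 = encodeWeight(0.789212f);
        byte[] catInDoc2 = encodeWeight(0.0023912f);

        float baseScore = 1.5f; // stand-in for whatever the regular scoring produced
        System.out.println("doc1: " + score(baseScore, catInDoc1));
        System.out.println("doc2: " + score(baseScore, catInDoc2));
    }
}
```

Because the payload is stored per term occurrence, the same term can carry a different weight in every document, which is what the question asks for.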
Upvotes: 5