Reputation: 1135
I ran the queries below and got this response:
"response":{"numFound":200,"start":0,"maxScore":20.458012,"docs":[
{
"food_group":"Dairy",
"carbs":"13.635",
"protein":"2.625",
"name":"Apple Milkshake",
"fat":"3.814",
"id":"109",
"calories":99.0,
"_version_":1565386306583789568,
"score":20.458012},
{
"food_group":"Proteins",
"carbs":"4.79",
"protein":"4.574",
"name":"Chettinad Egg Curry",
"fat":"6.876",
"id":"526",
"calories":99.0,
"_version_":1565386306489417728,
"score":19.107327}
.....//other documents...
]}
Queries:
q = (food_group:"Proteins" OR
food_group:"Dairy" OR
food_group:"Grains")
bf = div(1,abs(sub(100,calories)))^15
bq = food_group:"Proteins" + food_group:"Dairy" + food_group:"Grains"
My question: I have not given "Dairy" any boost relative to "Proteins" in bq, so why does the "Dairy" document have the higher score?
Upvotes: 0
Views: 168
Reputation: 15789
Because "Dairy" is a rarer term in your corpus. Lucene gives a higher score to a match on a rare term than to a match on a very common term.
If you want to get into the details, look up how BM25 similarity is computed. BM25 is what Lucene (and thus Solr) now uses by default. Before that it was TF-IDF, and the two are very similar.
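To make the rarity effect concrete, here is a rough sketch of the IDF part of BM25 in the form used by Lucene's BM25Similarity. The document counts are made up for illustration and are not taken from your index, and the full BM25 score also includes term-frequency and length-normalization factors that are left out here.

```python
import math


def bm25_idf(doc_freq: int, doc_count: int) -> float:
    """IDF component of BM25: log(1 + (N - n + 0.5) / (n + 0.5))."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Hypothetical numbers: 1000 documents in total, with "Dairy" appearing
# in far fewer of them than "Proteins". Substitute your own counts.
doc_count = 1000
doc_freq = {"Dairy": 50, "Proteins": 300}

idf = {term: bm25_idf(df, doc_count) for term, df in doc_freq.items()}

for term, value in idf.items():
    print(f"{term}: idf = {value:.3f}")
```

With these assumed counts the rarer term gets the larger IDF, so a document matching food_group:"Dairy" scores higher even though no explicit boost was set. Adding debugQuery=true to the request should show the actual per-term breakdown for your own data.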
Upvotes: 1