rnjai

Reputation: 1135

Solr - Why are scores of documents different although the query has not differentiated between them

I ran the queries below and got this response:

"response":{"numFound":200,"start":0,"maxScore":20.458012,"docs":[
      {
        "food_group":"Dairy",
        "carbs":"13.635",
        "protein":"2.625",
        "name":"Apple Milkshake",
        "fat":"3.814",
        "id":"109",
        "calories":99.0,
        "_version_":1565386306583789568,
        "score":20.458012},
      {
        "food_group":"Proteins",
        "carbs":"4.79",
        "protein":"4.574",
        "name":"Chettinad Egg Curry",
        "fat":"6.876",
        "id":"526",
        "calories":99.0,
        "_version_":1565386306489417728,
        "score":19.107327}
.....//other documents...
]}        

Queries:

q = (food_group:"Proteins"  OR
food_group:"Dairy"  OR
food_group:"Grains")

bf = div(1,abs(sub(100,calories)))^15
bq = food_group:"Proteins" + food_group:"Dairy" + food_group:"Grains"

My question: I have not given "Dairy" any boost relative to "Proteins" in bq, so why does the "Dairy" document have a higher score?

Upvotes: 0

Views: 168

Answers (1)

Persimmonium

Reputation: 15789

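Because "Dairy" is a rarer term in your corpus. Lucene gives a higher score to a match on a rare term than to a match on a very common term. The bf contribution is the same for both documents (both have calories 99.0), so the difference comes from the food_group term matches.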

If you want to get into the details, look up how BM25 similarity is computed. BM25 is what Lucene (and thus Solr) now uses by default; before that it was TF-IDF, and the two are very similar.
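To make the rare-term effect concrete, here is a minimal sketch of the IDF part of BM25 as Lucene's BM25Similarity computes it, log(1 + (N - n + 0.5) / (n + 0.5)), where N is the number of documents and n is the number of documents containing the term. The document counts below are hypothetical, chosen only for illustration; they are not taken from your index.

```python
import math


def bm25_idf(doc_freq: int, doc_count: int) -> float:
    """IDF component of BM25: log(1 + (N - n + 0.5) / (n + 0.5))."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Hypothetical corpus statistics (assumptions, not real index data).
doc_count = 1000          # total documents with a food_group value
dairy_doc_freq = 50       # documents where food_group is "Dairy"
proteins_doc_freq = 300   # documents where food_group is "Proteins"

dairy_idf = bm25_idf(dairy_doc_freq, doc_count)
proteins_idf = bm25_idf(proteins_doc_freq, doc_count)

# The rarer term ends up with the larger IDF, hence the larger score
# contribution when everything else about the match is equal.
print(f"Dairy IDF:    {dairy_idf:.4f}")
print(f"Proteins IDF: {proteins_idf:.4f}")
```

Running the query with debugQuery=true shows the actual idf values Solr used for each term in the score explanation.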

Upvotes: 1
