Reputation: 501
We're having some relevance issues with Solr results. In this particular example, product A shows up above product B. Product A's title contains the search term. Product B's title also contains the search term, and so do its Description and Category Name. So logically, Product B should be more relevant and appear above Product A, but it does not.
The schema is configured to take all of these extra fields into account. After analyzing the query's debug info with ...&debugQuery=true&debug.explain.structured=true, it appears that both products received the same score. Looking further, I can see scores being calculated for these extra fields, but for some reason the parser takes only the maximum of these scores instead of their sum, which makes the two totals identical.
Is there a reason that Solr behaves this way? Is there any way to change this behavior to use the sum instead of the max? (Just like in the parent element in the images)
Upvotes: 2
Views: 2130
Reputation: 16095
You can control how the score is calculated with the tie parameter, provided that you are using the DisMax or eDisMax query parser.
The Solr documentation explains it well:
The tie parameter specifies a float value (which should be something much less than 1) to use as tiebreaker in DisMax queries.
When a term from the user’s input is tested against multiple fields, more than one field may match. If so, each field will generate a different score based on how common that word is in that field (for each document relative to all other documents).
The tie parameter lets you control how much the final score of the query will be influenced by the scores of the lower scoring fields compared to the highest scoring field.
A value of "0.0" - the default - makes the query a pure "disjunction max query": that is, only the maximum scoring subquery contributes to the final score.
A value of "1.0" makes the query a pure "disjunction sum query" where it doesn’t matter what the maximum scoring sub query is, because the final score will be the sum of the subquery scores. Typically a low value, such as 0.1, is useful.
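To make the effect of tie concrete, here is a minimal Python sketch of how a disjunction-max query combines per-field scores: the highest field score, plus tie times the sum of the remaining field scores. The per-field scores, the search term, and the field names in qf are made up for illustration and are not taken from the question's debug output; defType, qf, tie and debugQuery are the standard request parameters.

```python
from urllib.parse import urlencode


def dismax_score(field_scores, tie=0.0):
    """Combine per-field scores as a disjunction-max query does:
    the best field score plus tie * (sum of the other field scores)."""
    if not field_scores:
        return 0.0
    best = max(field_scores)
    return best + tie * (sum(field_scores) - best)


# Hypothetical per-field scores for a single search term.
product_a = [2.0]            # matches in title only
product_b = [2.0, 1.0, 0.5]  # matches in title, description, category name

for tie in (0.0, 0.1, 1.0):
    print(tie, dismax_score(product_a, tie), dismax_score(product_b, tie))

# Request parameters that set tie explicitly (field names are assumptions).
params = urlencode({
    "q": "widget",
    "defType": "edismax",
    "qf": "title description category_name",
    "tie": "0.1",
    "debugQuery": "true",
})
print(params)
```

With tie=0.0 both products end up with the same score, which matches the behavior described in the question. Any tie above zero lets the extra matching fields lift product B above product A, and tie=1.0 gives the plain sum.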
Upvotes: 2