0xdeface

Reputation: 259

Solr search relevancy

I use Solr and I'm having trouble with result scores. For example, I have these docs with a single field (say, "content"):

  1. content = car
  2. content = cars
  3. content = carable awesome
  4. content = awful for carable

I run a search query with these params: { "mm":"1", "q":"car", "tie":"0.1", "defType":"dismax", "fl":"*, score" }

I expect to see the results like this:

The word without the "s" should rank higher, but I get strange results. How can I boost an exact match (like "car")?

Upvotes: 1

Views: 36

Answers (1)

MatsLindh

Reputation: 52912

This happens because the field type you're using has a stemming filter (or an n-gram filter) attached, which makes cars and car generate hits against each other. You can't boost "exact hits" inside such a field, since to Lucene they are the same value. What's stored in the index is the same for both car and cars: the latter is reduced to car as well.

To get exact hits ranked higher, add a second field without that filter, one that only tokenizes (splits) your content on whitespace and lowercases the tokens. That way you have a field where cars and car are stored as different tokens, and a token won't contribute to the score unless it actually matches.
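As a rough sketch of what that second field could look like, the snippet below builds a Schema API payload (it only prints it, nothing is sent to Solr). The names text_exact, exact_field and content are assumptions for illustration; adjust them to your schema, and note that you'd need to reindex after adding the field.

```python
import json


def build_exact_field_commands(source_field="content",
                               exact_field="exact_field",
                               type_name="text_exact"):
    """Build Schema API commands for a whitespace + lowercase field.

    All three names are hypothetical placeholders, not taken from a real schema.
    """
    return {
        # Field type with no stemming: split on whitespace, then lowercase.
        "add-field-type": {
            "name": type_name,
            "class": "solr.TextField",
            "positionIncrementGap": "100",
            "analyzer": {
                "tokenizer": {"class": "solr.WhitespaceTokenizerFactory"},
                "filters": [{"class": "solr.LowerCaseFilterFactory"}],
            },
        },
        # The extra field only needs to be searchable, not returned.
        "add-field": {
            "name": exact_field,
            "type": type_name,
            "indexed": True,
            "stored": False,
        },
        # Copy the original content into the exact field at index time.
        "add-copy-field": {"source": source_field, "dest": exact_field},
    }


payload = json.dumps(build_exact_field_commands(), indent=2)
print(payload)
```

The printed JSON could then be POSTed to your collection's schema endpoint, or you can express the same field type and copyField in your schema file by hand.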

You can use qf in Solr to tell Solr which fields you want to search against, and you can give a boost at the same time - so in your case you'd have qf=exact_field^10 text_field where hits in exact_field would be valued ten times higher than hits in the regular field (the exact boost values will depend on your use case and how you want the query profile to behave).
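To make that concrete, here is a minimal sketch that builds the query string for such a request, reusing the params from the question. The field names exact_field and text_field are the placeholders used above, and the boost of 10 is just a starting point, not a recommendation.

```python
from urllib.parse import urlencode


def build_dismax_query(term, exact_boost=10):
    """Return a URL-encoded dismax query string.

    exact_field and text_field are placeholder field names (assumptions).
    """
    params = {
        "defType": "dismax",
        "q": term,
        # Hits in exact_field are weighted exact_boost times higher.
        "qf": f"exact_field^{exact_boost} text_field",
        "mm": "1",
        "tie": "0.1",
        "fl": "*,score",
    }
    return urlencode(params)


query_string = build_dismax_query("car")
print("/select?" + query_string)
```

Appending the printed string to your collection's URL and comparing scores with debugQuery turned on should show how much the exact field contributes for each document.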

You can also use the different boost arguments (bq and boost) to apply boosts outside of your regular query (i.e. add a query to bq that replicates your original query), but the previous suggestion will probably work just fine.
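For the bq variant, a sketch under the same assumptions (placeholder field names, an arbitrary boost value) could look like this. It only handles a simple single-word term; anything with special query characters would need escaping first.

```python
from urllib.parse import urlencode


def build_bq_query(term, boost=10):
    """Return a dismax query string that boosts exact hits through bq.

    exact_field and text_field are placeholder field names (assumptions).
    Only suitable for a plain single-word term with no special characters.
    """
    params = {
        "defType": "dismax",
        "q": term,
        # The main query only searches the regular (stemmed) field.
        "qf": "text_field",
        # The boost query repeats the term against the exact field.
        "bq": f"exact_field:{term}^{boost}",
        "fl": "*,score",
    }
    return urlencode(params)


bq_query_string = build_bq_query("car")
print("/select?" + bq_query_string)
```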

Upvotes: 3
