JustAC0der

Reputation: 3169

Apache Solr: sort by number of fields matching the query

In my Solr index I have documents that consist of many fields: name, title, description, tags, etc. I would like to sort my documents by the number of fields that match a query, without taking into account how many times a matching term appears in a field (so no TF-IDF, no BM25).

For example:

Documents:
ID: 100, title: "foo foo bar bar", name: "foo bar"
ID: 101, title: "foo bar", name: "gibberish foo"
ID: 102, title: "foo bar", name: "foo bar"

When I search for "foo bar", I would like the results to be sorted in this order:

  1. 102 (two fields matching)
  2. 100 (also two fields matching, so 100 and 102 should be scored exactly the same)
  3. 101 (one field matching)

How can I achieve this with Solr? What should the sort clause be?

Upvotes: 1

Views: 921

Answers (1)

drjz

Reputation: 657

You could try disabling term-frequency scoring by using a constant-score query such as (tags:stack)^=1. Note that ^= is special syntax: it assigns a fixed score to the clause instead of boosting it. Then use the eDisMax query parser, list the fields in qf (possibly with some boosting), and set tie to a value like 1. The tie value is needed so that all subqueries (fields) contribute to the score, rather than only the best-matching one.

This should do what you want: more matching fields result in a higher score, and term frequencies are no longer used for scoring.
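
As a rough illustration of the constant-score idea, here is a minimal Python sketch that only builds the request parameters. It is a variation on the suggestion above, not the exact eDisMax setup: it assumes the standard (lucene) query parser with one explicit ^=1 clause per field, OR-ed together so that each matching field adds one point. The helper names field_clause and build_params are hypothetical, and the AND between terms is an assumption taken from the question's example, where "gibberish foo" does not count as a match for "foo bar".

```python
from urllib.parse import urlencode


def field_clause(field, terms):
    """Constant-score clause that scores 1 when every term occurs in `field`."""
    # Assumes plain alphanumeric terms; real user input would need Solr query escaping.
    required = " AND ".join(terms)
    return f"({field}:({required}))^=1"


def build_params(fields, terms):
    """Build Solr request parameters whose score is meant to count matching fields."""
    # OR-ing the clauses lets the boolean query add up one point per matching field.
    q = " OR ".join(field_clause(f, terms) for f in fields)
    return {"q": q, "defType": "lucene", "fl": "id,score", "sort": "score desc"}


params = build_params(["title", "name"], ["foo", "bar"])
print(params["q"])        # the raw query string
print(urlencode(params))  # URL-encoded form to append to the /select handler
```

For the fields title and name this produces the query (title:(foo AND bar))^=1 OR (name:(foo AND bar))^=1. With tokenized text fields, documents 100 and 102 should then receive the same score and rank ahead of 101, but it is worth checking the actual scores with debugQuery=true on your own schema.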

Upvotes: 2
