logdev

Reputation: 302

Validating my understanding of the dis_max query in Elasticsearch

I have been trying to understand how the dis_max query works and would like to validate my understanding. Please tell me whether I have understood it correctly.

According to the documentation, a dis_max query is:

A query that generates the union of documents produced by its subqueries, and that scores each document with the maximum score for that document as produced by any subquery, plus a tie breaking increment for any additional matching subqueries.

Suppose our ES cluster contains the following documents: {"FOO":"ABC"}, {"FOO":"XYZ"}, {"FOO":"ABC XYZ"}, {"FOO":"ABC DEF"}, {"FOO":"DEF"}, and the dis_max query is:

{
 "dis_max": {
   "queries": [
     {
       "match": {
         "FOO": "ABC"
       }
     },
     {
       "match": {
         "FOO": "XYZ"
       }
     }
   ]
 }
}

So, following the documentation, let us first find the union of the documents returned by the dis_max subqueries. That union would be {"FOO":"ABC"}, {"FOO":"XYZ"}, {"FOO":"ABC XYZ"}, {"FOO":"ABC DEF"}. The next step is to score each document with the maximum score produced for it by any subquery, which would go something like this:

{"FOO":"ABC"} will be scored against both {"match":{"FOO": "ABC"}} and {"match":{"FOO": "XYZ"}}, and the higher of the two scores will be used. Similarly, {"FOO":"XYZ"} will be scored against both subqueries and the higher score will be used. This is repeated for every document in the union, and finally the documents are returned sorted by score.
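
To make the steps concrete, here is a minimal Python sketch of how I picture them. It is only a toy model under my own assumptions: `toy_match_score` is a made-up stand-in scorer (real Elasticsearch relevance scoring works differently), and `docs`, `subqueries`, `dis_max_score` and `run_dis_max` are names I invented for illustration. The `tie_breaker` argument models the "tie breaking increment" from the quoted definition; as far as I understand, it corresponds to the `tie_breaker` parameter of `dis_max`, which defaults to 0.0.

```python
# Toy model of dis_max scoring. The scorer below is a stand-in, NOT
# Elasticsearch's real relevance scoring.
docs = ["ABC", "XYZ", "ABC XYZ", "ABC DEF", "DEF"]  # values of the FOO field
subqueries = ["ABC", "XYZ"]                         # the two match clauses


def toy_match_score(term, text):
    """Hypothetical scorer: 0.0 if the term is absent, otherwise a score
    that is higher for shorter field values."""
    tokens = text.split()
    return 1.0 / len(tokens) if term in tokens else 0.0


def dis_max_score(scores, tie_breaker=0.0):
    """Best subquery score plus tie_breaker times the other matching scores.
    Returns None when no subquery matched (document is outside the union)."""
    matching = [s for s in scores if s > 0]
    if not matching:
        return None
    best = max(matching)
    return best + tie_breaker * (sum(matching) - best)


def run_dis_max(tie_breaker=0.0):
    """Score every document and return (field value, score) pairs,
    sorted by descending score."""
    results = {}
    for text in docs:
        scores = [toy_match_score(q, text) for q in subqueries]
        score = dis_max_score(scores, tie_breaker)
        if score is not None:
            results[text] = score
    return sorted(results.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    print(run_dis_max(tie_breaker=0.0))
    print(run_dis_max(tie_breaker=0.5))
```

In this toy model {"FOO":"DEF"} never appears in the results, and {"FOO":"ABC XYZ"} is the only document whose score changes with the tie breaker (0.5 with `tie_breaker=0.0`, 0.75 with `tie_breaker=0.5`), because it is the only one matched by both subqueries.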

Is this how the dis_max query works? Or did I misunderstand or miss anything?

Upvotes: 3

Views: 1305

Answers (0)
