Mark

Reputation: 3738

Interpreting minimum_should_match for elasticsearch

I'm very new to Elasticsearch queries, and I'm hoping to get some clarification on the following query from existing source code that I'm looking at.

body["query"]["bool"]["should"] = [
    {"match": {"categories": {"query": raw_query, "operator": "and"}}},
    ....,
    ....,
    {"match": {"all": {"query": raw_query, "minimum_should_match": "50%" if keywords else "2<80%"}}}
]
body["query"]["bool"]["minimum_should_match"] = 1

My understanding of minimum_should_match is that it specifies the number of minimum matching words to the query. For example, in the following query, a document matches if its description field contains any 2 of the 3 words young, transformation and Egyptian.

"query":{
    "match":{
        "description":{
          "query" : "young transformation Egyptian",
          "minimum_should_match" : 2
        }
    }
}

From the source code, I understand that "minimum_should_match": "50%" means that, if there are keywords, at least half of the words in raw_query must match what is in the field all. What confuses me is 2<80%. I've read the docs but I'm still confused.

The docs, https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-minimum-should-match.html, give the example 3<90% and say:

if there are 1 to 3 clauses they are all required, but for 4 or more clauses only 90% are required.

What exactly is a clause? I would think that each match statement is a clause, but in the source code this parameter is placed inside a single match clause. In that case, how can there ever be more than one clause? My understanding is obviously incorrect.

The last part I need confirmation on is:

body["query"]["bool"]["minimum_should_match"] = 1

Since it is placed outside of should, does that mean only a single match from body["query"]["bool"]["should"] is required?

Upvotes: 2

Views: 707

Answers (1)

ilvar

Reputation: 5841

My understanding of minimum_should_match is that it specifies the number of minimum matching words to the query.

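Not always. It depends on where the parameter is placed. Set directly on the bool query, as in body["query"]["bool"]["minimum_should_match"] = 1, it applies not to a specific full-text query but to the bool query itself, and controls how many of the should clauses must match. So yes, a document needs to match only one of the clauses in should.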
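To make the two levels visible side by side, here is a minimal sketch of the request body from the question. The values of raw_query and keywords are hypothetical stand-ins, and only the two should clauses that the question shows are included (the elided ones are left out).

```python
# Hypothetical stand-ins for values the original code computes elsewhere.
raw_query = "young transformation Egyptian"
keywords = False

body = {"query": {"bool": {}}}
body["query"]["bool"]["should"] = [
    # Clause 1: every token of raw_query must appear in "categories".
    {"match": {"categories": {"query": raw_query, "operator": "and"}}},
    # Clause 2: token-level threshold, counted over the tokens of raw_query.
    {"match": {"all": {
        "query": raw_query,
        "minimum_should_match": "50%" if keywords else "2<80%",
    }}},
]
# Clause-level threshold: at least one entry of the should list must match.
body["query"]["bool"]["minimum_should_match"] = 1
```

The inner setting counts tokens inside one match query, while the outer setting counts entries of the should list.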

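If minimum_should_match is set inside a specific match query, as in the last clause of your should list, then yes, it controls how many tokens (words) of the query text must match. Each analyzed token counts as one clause, which is how a single match query can have several clauses.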
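As a rough illustration of how a value such as 2<80% turns into a required token count, the sketch below follows the rules described in the docs: percentages are rounded down, negative values say how many clauses may be missing, and N<spec means all clauses are required up to N. The function required_matches is a hypothetical helper, not part of Elasticsearch, and it handles only a single N<spec condition.

```python
def required_matches(spec, clause_count):
    """Return how many of clause_count optional clauses must match.

    Simplified model of the documented minimum_should_match rules;
    not the Elasticsearch implementation.
    """
    spec = str(spec).strip()
    if "<" in spec:
        # "N<spec": with N or fewer clauses, all of them are required.
        threshold, rest = spec.split("<", 1)
        if clause_count <= int(threshold):
            return clause_count
        return required_matches(rest, clause_count)
    if spec.endswith("%"):
        percent = int(spec[:-1])
        # Integer division rounds the computed number down.
        part = clause_count * abs(percent) // 100
        result = part if percent >= 0 else clause_count - part
    else:
        number = int(spec)
        # A negative integer is the number of clauses that may be missing.
        result = number if number >= 0 else clause_count + number
    return max(result, 0)


print(required_matches("2<80%", 2))  # 2 tokens: both required
print(required_matches("2<80%", 5))  # 5 tokens: 80% of 5 = 4 required
print(required_matches("50%", 4))    # 4 tokens: 2 required
```

Under this model, a three-word query with 2<80% needs 2 of its 3 tokens to match, since 80% of 3 is 2.4 and is rounded down. This assumes the analyzer produces one token per word.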

Upvotes: 2
