Steve

Reputation: 4463

multi_match vs should match vs must query_string in ElasticSearch

I tried these three types of query in Elasticsearch and am wondering which one is the most suitable (most accurate and most efficient). Basically, one person can have multiple sets of names (an array). Each name is split into firstName, surname and middleName; some people have only a first name and a surname. The input parameter is the full name (first name, surname and middle name combined in one string). Fuzzy logic is added. One difference I noticed is the score.

This is the score of the first result returned.

Does this mean that the second query is the most accurate query for this requirement?

GET /person/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "multi_match": {
            "query": "David Bill Gonzalo~",
            "fields": [
              "nameDetails.name.nameValue.firstName",
              "nameDetails.name.nameValue.surname",
              "nameDetails.name.nameValue.middleName"
            ]
          }
        }
      ]
    }
  }
}

GET /person/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "nameDetails.name.nameValue.firstName": "David Bill Gonzalo~"
          }
        },
        {
          "match": {
            "nameDetails.name.nameValue.surname": "David Bill Gonzalo~"
          }
        },
        {
          "match": {
            "nameDetails.name.nameValue.middleName": "David Bill Gonzalo~"
          }
        }
      ]
    }
  }
}



GET /person/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "fields": [
              "nameDetails.name.nameValue.firstName",
              "nameDetails.name.nameValue.surname",
              "nameDetails.name.nameValue.middleName"
            ],
            "query": "David Bill Gonzalo~"
          }
        }
      ]
    }
  }
}

Upvotes: 1

Views: 1561

Answers (1)

Bhavya

Reputation: 16172

First Query:

The multi-match query allows us to run a query on multiple fields. It is an extension of the match query.

In the first query you have not specified a type parameter, so the default type, best_fields, is used. This finds all documents that match the query in any of the fields, but the _score is taken from the best-matching field only.

To learn more about the types of multi-match queries, refer to the multi_match section of the Elasticsearch documentation.
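To make the best_fields behaviour concrete, here is a rough sketch of what the first query amounts to: one match query per field, wrapped in a dis_max query that keeps only the highest-scoring clause. The explicit tie_breaker of 0.0 is written out only for illustration; it is the default, and this is a sketch rather than the exact query Elasticsearch builds internally.

```
# Sketch: approximate equivalent of multi_match with type best_fields
GET /person/_search
{
  "query": {
    "dis_max": {
      "queries": [
        { "match": { "nameDetails.name.nameValue.firstName": "David Bill Gonzalo~" } },
        { "match": { "nameDetails.name.nameValue.surname": "David Bill Gonzalo~" } },
        { "match": { "nameDetails.name.nameValue.middleName": "David Bill Gonzalo~" } }
      ],
      "tie_breaker": 0.0
    }
  }
}
```

Raising tie_breaker above 0 lets the other matching fields contribute a fraction of their score as well.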


Second Query:

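This is a bool query that combines three match queries as should clauses. The scores of all matching should clauses are added together to produce the final score, which is why this query returns a higher score than the other two. A higher score here does not by itself mean the query is more accurate; scores produced by different query types are not directly comparable.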
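If this additive scoring is what you want, a multi_match query with type most_fields should behave much like the bool/should version, since it also combines the scores from every matching field. The following is a minimal sketch under that assumption, reusing the field names from the question:

```
# Sketch: multi_match that combines scores across fields, similar to bool/should
GET /person/_search
{
  "query": {
    "multi_match": {
      "query": "David Bill Gonzalo~",
      "type": "most_fields",
      "fields": [
        "nameDetails.name.nameValue.firstName",
        "nameDetails.name.nameValue.surname",
        "nameDetails.name.nameValue.middleName"
      ]
    }
  }
}
```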


Third Query:

In the third query, query_string is running against multiple fields.

You have not specified a type parameter here either, so best_fields is used by default. As with the first query, this finds all documents that match the query, but the _score is taken from the best-matching field only.
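One thing that does differ from the other two queries is that query_string parses its input with the query string syntax, where ~ is the fuzzy operator and applies only to the term it follows. In "David Bill Gonzalo~" only Gonzalo is matched fuzzily. A sketch that makes the type explicit and applies fuzziness to every term could look like this (the bool/must wrapper is dropped because it holds a single clause):

```
# Sketch: explicit type, fuzzy operator repeated on each term
GET /person/_search
{
  "query": {
    "query_string": {
      "fields": [
        "nameDetails.name.nameValue.firstName",
        "nameDetails.name.nameValue.surname",
        "nameDetails.name.nameValue.middleName"
      ],
      "query": "David~ Bill~ Gonzalo~",
      "type": "best_fields"
    }
  }
}
```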


Since you are querying multiple fields with the same query parameter, i.e. "David Bill Gonzalo~", in my opinion you should use a multi-match query. You can also use multi-match queries with different options, such as boosting one or more fields or adding a type parameter.
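As a starting point, here is a hedged sketch of such a multi-match query. Note that match and multi_match do not interpret a trailing ~ as a fuzzy operator (it is just part of the text handed to the analyzer), so fuzzy matching is requested through the fuzziness parameter instead. The boost of 2 on surname and the AUTO fuzziness are illustrative assumptions, not values taken from your setup, and should be tuned against your data.

```
# Sketch: multi_match with an explicit type, a field boost and fuzziness
# (the ^2 boost and "AUTO" are assumed values for illustration)
GET /person/_search
{
  "query": {
    "multi_match": {
      "query": "David Bill Gonzalo",
      "type": "best_fields",
      "fields": [
        "nameDetails.name.nameValue.firstName",
        "nameDetails.name.nameValue.surname^2",
        "nameDetails.name.nameValue.middleName"
      ],
      "fuzziness": "AUTO"
    }
  }
}
```

The fuzziness parameter works with best_fields and most_fields, but it cannot be combined with the cross_fields, phrase or phrase_prefix types.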

Upvotes: 3
