Sheikhah Maasher

Reputation: 443

Elasticsearch query does not return an exact match of an array

I have a question about querying an array in Elasticsearch. In my case, the custom attributes are an array of objects, each containing inner_name and value. The type of value is mixed (it could be a string, number, array, date, etc.), and one of the types is multi-checkbox, which takes an array as input. The custom attributes are mapped as below:

"attributes" : {
              "properties" : {
                "_id" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                },
                "inner_name" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                },
                "value" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },

I used mongoosastic to index my MongoDB data into ES, so the structure of the custom attributes looks like this:

[
  {
    "customer_name" : "customerX",
    "custom_attributes" : [
      {
        "group" : "xyz",
        "attributes" : [
          {
            "inner_name" : "attr1",
            "value" : 123
          },
          {
            "inner_name" : "attr2",
            "value" : [
              "Val1",
              "Val2",
              "Val3",
              "Val4"
            ]
          }
        ]
      }
    ]
  },
  {
    "customer_name" : "customerY",
    "custom_attributes" : [
      {
        "group" : "xyz",
        "attributes" : [
          {
            "inner_name" : "attr2",
            "value" : [
              "Val1",
              "Val2"
            ]
          }
        ]
      }
    ]
  }
]

I want to run a query that matches only when all of the given values are in the array. The problem with the query below is that it returns a document whenever it contains any of the values. Here's the query:

{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "custom_attributes.attributes.inner_name": "attr2"
          }
        },
        {
          "terms": {
            "custom_attributes.attributes.value": [
              "val1",
              "val2",
              "val3",
              "val4"
            ]
          }
        }
      ]
    }
  }
}

For example, it returns both documents when it should return only the first one. What is wrong with my query? Is there another way to write it?

Upvotes: 1

Views: 548

Answers (1)

Alkis Kalogeris

Reputation: 17745

The Elasticsearch terms query matches a document if any of the listed values is present, so think of its operator as OR rather than the AND you want. There are two solutions for this:

  1. Use multiple term queries inside your bool must clause, which provides the AND behavior you need:
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "custom_attributes.attributes.inner_name": "attr2"
          }
        },
        {
          "term": {
            "custom_attributes.attributes.value": "val1"
          }
        },
        {
          "term": {
            "custom_attributes.attributes.value": "val2"
          }
        },
        {
          "term": {
            "custom_attributes.attributes.value": "val3"
          }
        },
        {
          "term": {
            "custom_attributes.attributes.value": "val4"
          }
        }
      ]
    }
  }
}
  2. Use a match query with the operator set to and, together with the whitespace analyzer. This WILL NOT work if your terms contain whitespace:
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "custom_attributes.attributes.inner_name": "attr2"
          }
        },
        {
          "match": {
            "custom_attributes.attributes.value": {
              "query": "val1 val2 val3 val4",
              "operator": "and",
              "analyzer": "whitespace"
            }
          }
        }
      ]
    }
  }
}
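
If the list of values is not fixed, both query bodies above can be generated instead of written by hand. Below is a minimal sketch in plain Python that only builds the JSON bodies; the function names all_values_term_query and all_values_match_query are made up for illustration, and sending the body to Elasticsearch is left to whatever client you use. It assumes the value field keeps the default analyzer for text fields (which lowercases tokens) and that every value is a single token.

```python
import json

# Field paths taken from the question's documents.
NAME_FIELD = "custom_attributes.attributes.inner_name"
VALUE_FIELD = "custom_attributes.attributes.value"


def all_values_term_query(inner_name, values):
    """Solution 1: one term clause per value inside bool/must (AND)."""
    must = [{"match": {NAME_FIELD: inner_name}}]
    # term queries are not analyzed. Assumption: the field uses the default
    # text analyzer, so indexed tokens are lowercase and we lowercase here too.
    must += [{"term": {VALUE_FIELD: str(v).lower()}} for v in values]
    return {"query": {"bool": {"must": must}}}


def all_values_match_query(inner_name, values):
    """Solution 2: a single match query with operator 'and'."""
    tokens = [str(v).lower() for v in values]
    # The whitespace analyzer splits on whitespace, so a value that contains
    # whitespace would turn into several required terms. Reject it early.
    if any(len(t.split()) != 1 for t in tokens):
        raise ValueError("values must not be empty or contain whitespace")
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {NAME_FIELD: inner_name}},
                    {
                        "match": {
                            VALUE_FIELD: {
                                "query": " ".join(tokens),
                                "operator": "and",
                                "analyzer": "whitespace",
                            }
                        }
                    },
                ]
            }
        }
    }


if __name__ == "__main__":
    wanted = ["Val1", "Val2", "Val3", "Val4"]
    print(json.dumps(all_values_term_query("attr2", wanted), indent=2))
    print(json.dumps(all_values_match_query("attr2", wanted), indent=2))
```

Either function returns a dictionary that can be passed as the search body. The first one grows by one clause per value; the second stays compact but keeps the whitespace restriction mentioned above.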

Upvotes: 1
