Paté

Reputation: 1964

Nested documents and boolean query with Elasticsearch

I'm trying to use a must_not boolean query on nested documents but I keep getting weird results.

Here is an example to illustrate my issue.

curl -X DELETE "http://localhost:9200/must_again/"
curl -X POST "http://localhost:9200/must_again/" -d '{
  "mappings": {
    "class": {
      "properties": {
        "title": {
          "type": "string"
        },
        "teachers": {
          "type": "nested",
          "properties": {
            "name": {
              "type": "string"
            }
          }
        }
      }
    }
  }
}'

curl -XPUT 'http://localhost:9200/must_again/class/1' -d '{
  "title": "class1",
  "teachers": [
    {
      "name": "alex"
    },
    {
      "name": "steve"
    }
  ]
}'

curl -XPUT 'http://localhost:9200/must_again/class/2' -d '{
  "title": "class2",
  "teachers": [
    {
      "name": "alex"
    }
  ]
}'

curl -XPUT 'http://localhost:9200/must_again/class/3' -d '{
  "title": "class3",
  "teachers": []
}'

At this point, I have 3 classes: one where Steve is teaching (together with Alex), one where only Alex is teaching, and one where there is no teacher.

My goal is to get the last 2, i.e. every class where Steve is not teaching.

The query I was working with is

curl -XGET 'http://localhost:9200/must_again/class/_search' -d '{
  "query": {
    "nested": {
      "path": "teachers",
      "query": {
        "bool": {
          "must_not": [
            {
              "match": {
                "teachers.name": "steve"
              }
            }
          ]
        }
      }
    }
  }
}'

This returns

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1.0,
    "hits": [
      {
        "_index": "must_again",
        "_type": "class",
        "_id": "2",
        "_score": 1.0,
        "_source": {
          "title": "class2",
          "teachers": [
            {
              "name": "alex"
            }
          ]
        }
      },
      {
        "_index": "must_again",
        "_type": "class",
        "_id": "1",
        "_score": 1.0,
        "_source": {
          "title": "class1",
          "teachers": [
            {
              "name": "alex"
            },
            {
              "name": "steve"
            }
          ]
        }
      }
    ]
  }
}

So class2 is expected, but class1 should not be returned, and class3 is missing.

If I run the same query with must instead of must_not, I do get the right result (only class1).

Not sure what I'm doing wrong?

Upvotes: 4

Views: 218

Answers (1)

progrrammer

Reputation: 4489

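A workaround: move the must_not outside the nested query. A nested query matches a parent document as soon as any one of its nested objects matches the inner query, so with must_not inside, class1 matches through alex, and class3 never matches because it has no nested objects at all. Negating the nested query as a whole excludes every class that has a teacher named steve.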

curl -XPOST "http://localhost:9200/must_again/class/_search" -d'
{
   "query": {
      "bool": {
         "must_not": [
            {
               "nested": {
                  "path": "teachers",
                  "query": {
                     "bool": {
                        "must": [
                           {
                              "match": {
                                 "teachers.name": "steve"
                              }
                           }
                        ]
                     }
                  }
               }
            }
         ]
      }
   }
}'
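
To see why the position of must_not matters, here is a rough sketch in plain Python that mimics the matching logic on the three classes from the question. It is only a toy model, not Elasticsearch code: the helper names (nested, must_not_inside, must_not_outside) are made up for illustration, and it assumes a nested query selects a parent when at least one nested object satisfies the inner query.

```python
# Toy in-memory model of how a nested query selects parent documents.
# Not Elasticsearch code: all helper names here are hypothetical.

classes = {
    "class1": ["alex", "steve"],
    "class2": ["alex"],
    "class3": [],
}


def nested(teachers, predicate):
    """Assumed rule: the parent matches if ANY nested object matches."""
    return any(predicate(name) for name in teachers)


def must_not_inside(teachers):
    # nested -> bool -> must_not -> match "steve" (query from the question)
    return nested(teachers, lambda name: name != "steve")


def must_not_outside(teachers):
    # bool -> must_not -> nested -> match "steve" (query from this answer)
    return not nested(teachers, lambda name: name == "steve")


inside_hits = sorted(t for t, names in classes.items() if must_not_inside(names))
outside_hits = sorted(t for t, names in classes.items() if must_not_outside(names))

print(inside_hits)   # ['class1', 'class2']
print(outside_hits)  # ['class2', 'class3']
```

The first list reproduces the result from the question (class1 sneaks in through alex, class3 has nothing to match), and the second list is what the query above should give.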

Hope this helps!!

Upvotes: 3
