Pampa Nello

Reputation: 242

How to count the total number of documents that have more than one object in an Elasticsearch array field?

The structure of the documents in my index is similar to:

{
  "_index": "blabla",
  "_type": "_doc",
  "_source": {
    "uid": 5366492,
    "aField": "Hildegard",
    "aNestedField": [{
        "prop": {
          "data": "xxxxxxx"
        }
      },
      {
        "prop": {
          "data": "yyyyyyyy"
        }
      }
    ]
  }
}

I would like to get the total number of documents in the whole index that have more than one object in the aNestedField field. The document above would be counted, because it has 2.

If my index has 100 documents, and the one above is the only one with more than one object in that field, I would expect my query to return 1.

Is there a way of doing it?


Updated after having read the comments.

The mapping for the field is:

{
  "aNestedField": {
    "properties": {
      "prop": {
        "properties": {
          "data": {
            "type": "text",
            "index": false
          }
        }
      }
    }
  }
}

The data will not be updated often, so there is no need to worry about that.

Upvotes: 1

Views: 576

Answers (1)

Joe - Check out my books

Reputation: 16943

Since the prop.data field is not being indexed ("index": false), you'll need at least one field inside of each aNestedField object that is being indexed -- either by explicitly setting "index": true or by not setting "index": false in its mapping.
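As a rough sketch of what such a mapping could look like, here it is as a Python dict that serializes to a JSON mapping body. The field name `id` and the type `integer` are assumptions chosen for illustration; any indexed field would do. Documents that are already in the index would still need to be reindexed with the new field.

```python
import json

# Hypothetical mapping: `id` is an assumed field name, `integer` an assumed type.
# "index" is not set on `id`, so it keeps its default (true) and the field is indexed.
mapping = {
    "properties": {
        "aNestedField": {
            "properties": {
                "id": {"type": "integer"},
                "prop": {
                    "properties": {
                        # Unchanged from the question: stored but not indexed.
                        "data": {"type": "text", "index": False}
                    }
                },
            }
        }
    }
}

# Print the JSON body, e.g. for a mapping update on the `blabla` index.
print(json.dumps(mapping, indent=2))
```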

Your docs should then look something like this:

{
  "uid": 5366492,
  "aField": "Hildegard",
  "aNestedField": [
    {
      "id": 1,    <--
      "prop": {
        "data": "xxxxxxx"
      }
    },
    {
      "id": 2,    <--
      "prop": {
        "data": "yyyyyyyy"
      }
    },
    {
      "id": 3,    <--
      "prop": {
        "data": "yyyyyyyy"
      }
    }
  ]
}

id is arbitrary -- use anything that makes sense.

After that you'll be able to query for docs with more than 2 array objects (change the threshold to > 1 for "more than one object", as in the question) using:

GET /_search
{
  "query": {
    "script": {
      "script": "doc['aNestedField.id'].size() > 2"
    }
  }
}
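If only the number is needed rather than the hits themselves, one option is to send the same query to the `_count` API (for example `GET /blabla/_count`), whose response carries a `count` field. The sketch below builds such a request body with the threshold passed as a script parameter, and also mimics the check on in-memory dicts so the expected result can be sanity-checked. `build_count_body` and `count_locally` are hypothetical helper names, and the sketch assumes every array object carries a distinct `id`.

```python
def build_count_body(field: str, threshold: int) -> dict:
    """Build a query body using the same script query as above.

    `field` is the indexed sub-field (assumed here: "aNestedField.id").
    """
    return {
        "query": {
            "script": {
                "script": {
                    "source": f"doc['{field}'].size() > params.threshold",
                    "params": {"threshold": threshold},
                }
            }
        }
    }


def count_locally(docs: list, array_field: str, sub_field: str, threshold: int) -> int:
    """Count docs whose array holds more than `threshold` objects with `sub_field`.

    This only approximates what the script checks; it runs on plain dicts.
    """
    total = 0
    for doc in docs:
        values = [obj[sub_field] for obj in doc.get(array_field, []) if sub_field in obj]
        if len(values) > threshold:
            total += 1
    return total


# Threshold 1 corresponds to "more than one object" from the question.
body = build_count_body("aNestedField.id", 1)

sample_docs = [
    {"uid": 1, "aNestedField": [{"id": 1}, {"id": 2}]},
    {"uid": 2, "aNestedField": [{"id": 1}]},
    {"uid": 3},
]
matching = count_locally(sample_docs, "aNestedField", "id", 1)
```

With the three sample documents, only the first has more than one object, so `matching` is 1.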

Upvotes: 2
