Aggregation on field values (regex)

Question

I am trying to perform an aggregation to group documents by the first two letters of a specific field value.

I have successfully aggregated my documents by a specific field name, but I don't know how to aggregate on part of the field's value.

For example, given these documents:

[
  {
    "name": "John"
  },
  {
    "name": "Jog"
  },
  {
    "name": "James"
  },
  {
    "name": "Robert"
  },
  {
    "name": "Jessica"
  }
]

I would like to get the following response:

[
  {
    "key": "Jo",
    "doc_count": 2 
  },
  {
    "key": "Ja",
    "doc_count": 1
  },
  {
    "key": "Ro",
    "doc_count": 1
  },
  {
    "key": "Je",
    "doc_count": 1
  }
]

Is there an aggregation query able to do that?

Val · Accepted Answer

You could use a terms aggregation with a script instead of a field, like this:

{
  "size": 0,
  "aggs": {
    "first_two": {
      "terms": {
        "script": "doc.name.value?.size() >= 2 ? doc.name.value?.substring(0, 2) : doc.name.value"
      }
    }
  }
}

Note that if every name value has at least two characters, the script could simply be doc.name.value?.substring(0, 2). The script above also handles single-character names, which are used whole as the bucket key.

Also make sure dynamic scripting is enabled on your cluster; otherwise Elasticsearch will reject the inline script.
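
To sanity-check what buckets the script should produce, the same per-document logic can be sketched in plain Python, without touching Elasticsearch. This is only an illustration: the helper names first_two and bucket_by_prefix are made up for this sketch, and documents without a name are simply skipped here, which is an assumption rather than a statement about how the aggregation treats missing values.

```python
from collections import Counter

def first_two(name):
    # Mirrors the script: names shorter than two characters are used whole.
    return name[:2] if len(name) >= 2 else name

def bucket_by_prefix(docs):
    # Count documents per two-letter prefix, skipping docs without a name
    # (an assumption made for this sketch).
    counts = Counter(
        first_two(d["name"]) for d in docs if d.get("name") is not None
    )
    return [{"key": k, "doc_count": c} for k, c in counts.most_common()]

docs = [
    {"name": "John"},
    {"name": "Jog"},
    {"name": "James"},
    {"name": "Robert"},
    {"name": "Jessica"},
]

buckets = bucket_by_prefix(docs)
for bucket in buckets:
    print(bucket)
```

For the sample documents from the question this yields a count of 2 for "Jo" and 1 each for "Ja", "Ro" and "Je". The order of equal-count buckets in this sketch need not match what Elasticsearch returns.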
