Ludo

Reputation: 5280

Aggregation on field values (regex)

I am trying to perform an aggregation to group documents by the first two letters of a specific field value.

I successfully aggregated my documents by a specific field name, but I don't know how to aggregate on part of the field's value.

For example, for the docs:

[
  {
    "name": "John"
  },
  {
    "name": "Jog"
  },
  {
    "name": "James"
  },
  {
    "name": "Robert"
  },
  {
    "name": "Jessica"
  }
]

I would like to get the following response:

[
  {
    "key": "Jo",
    "doc_count": 2 
  },
  {
    "key": "Ja",
    "doc_count": 1
  },
  {
    "key": "Ro",
    "doc_count": 1
  },
  {
    "key": "Je",
    "doc_count": 1
  }
]

Is there an aggregation query able to do that?

Upvotes: 1

Views: 592

Answers (1)

Val

Reputation: 217564

You could use a terms aggregation with a script instead of a field, like this:

{
  "size": 0,
  "aggs": {
    "first_two": {
      "terms": {
        "script": "doc.name.value?.size() >= 2 ? doc.name.value?.substring(0, 2) : doc.name.value"
      }
    }
  }
}

Note that if all of your name values have at least two characters, the script could simply be doc.name.value?.substring(0, 2). The script above also handles single-character names by returning them unchanged.

Also make sure dynamic scripting is enabled on your cluster, otherwise the inline script will be rejected.
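
If you want to sanity-check which buckets to expect before running the query, here is a minimal Python sketch that applies the same first-two-characters rule to the sample documents from the question. It runs entirely client-side and does not talk to Elasticsearch; the function name first_two_buckets is made up for illustration, and the tie order among equal counts is simply insertion order here, which may differ from what Elasticsearch returns.

```python
from collections import Counter


def first_two_buckets(docs, field="name"):
    """Group docs by the first two characters of `field` (hypothetical helper).

    Mirrors the script's rule: values shorter than two characters
    are used as the bucket key unchanged.
    """
    counts = Counter()
    for doc in docs:
        value = doc.get(field)
        if value is None:
            continue  # documents without the field land in no bucket
        counts[value[:2]] += 1  # slicing is safe for 1-character values
    # Highest count first; ties keep insertion order (an artifact of this sketch)
    return [{"key": k, "doc_count": c} for k, c in counts.most_common()]


docs = [
    {"name": "John"},
    {"name": "Jog"},
    {"name": "James"},
    {"name": "Robert"},
    {"name": "Jessica"},
]

for bucket in first_two_buckets(docs):
    print(bucket)
```

For the five sample documents this gives a "Jo" bucket with a count of 2 and one document each for "Ja", "Ro" and "Je".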

Upvotes: 2
