darksigma

Reputation: 313

Elasticsearch aggregations over regex matching in a list

My documents in Elasticsearch have the following form:

{
    ...
    dimensions : list[string]
    ...
}

I'd like to find every value of dimensions, across all documents, that matches a regex. I suspect an aggregation would do the trick, but I'm having trouble formulating it.

For example, suppose I have three documents as below:

{
    ...
    dimensions : ["alternative", "alto", "hello"]
    ...
}


{
    ...
    dimensions : ["hello", "altar"]
    ...
}


{
    ...
    dimensions : ["nore", "sore"]
    ...
}

I'd like to get the result ["alternative", "alto", "altar"] when I query for the regex "alt.*".

Upvotes: 3

Views: 6100

Answers (1)

Val

Reputation: 217574

You can achieve that with a terms aggregation and its include parameter, which accepts either a regular expression (e.g. alt.* in your case) or an array of exact values to keep in the buckets. The regular expression is matched against the whole term, so dimensions needs to be indexed as exact, unanalyzed values for the buckets to contain the original strings. There is also an exclude counterpart, if needed:

{
  "size": 0,
  "aggs": {
    "dims": {
      "terms": {
        "field": "dimensions",
        "include": "alt.*"
      }
    }
  }
}

Results:

{
  ...
  "aggregations" : {
    "dims" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [ {
        "key" : "altar",
        "doc_count" : 1
      }, {
        "key" : "alternative",
        "doc_count" : 1
      }, {
        "key" : "alto",
        "doc_count" : 1
      } ]
    }
  }
}
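
To make the semantics of the aggregation above concrete, here is a minimal local sketch in Python that mimics what the terms aggregation with include does on the three sample documents. It does not talk to Elasticsearch; the function name terms_with_include is hypothetical, and Python's re syntax is only an approximation of the Lucene regular expression syntax Elasticsearch uses (the simple pattern alt.* behaves the same in both).

```python
import re
from collections import Counter


def terms_with_include(docs, field, include):
    """Hypothetical helper: count documents per distinct term of `field`,
    keeping only the terms that match the `include` pattern in full."""
    pattern = re.compile(include)
    counts = Counter()
    for doc in docs:
        # set() so that a term repeated inside one document counts once,
        # which is how doc_count behaves.
        for term in set(doc.get(field, [])):
            # fullmatch: the pattern must cover the whole term.
            if pattern.fullmatch(term):
                counts[term] += 1
    # Order by doc_count descending, then by key for ties.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"key": key, "doc_count": count} for key, count in ordered]


# The three sample documents from the question.
docs = [
    {"dimensions": ["alternative", "alto", "hello"]},
    {"dimensions": ["hello", "altar"]},
    {"dimensions": ["nore", "sore"]},
]

buckets = terms_with_include(docs, "dimensions", "alt.*")
print([bucket["key"] for bucket in buckets])
# prints ['altar', 'alternative', 'alto']
```

The bucket keys are the distinct matching values, which is the list the question asks for; "hello" is dropped even though it shares documents with matching terms, because the filter applies to terms, not to documents.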

Upvotes: 8
