sukumar c

Reputation: 31

Elasticsearch bucket script for percentile-aggregated values

My problem is as follows. I have a single search call to Elasticsearch whose query computes a 99th-percentile aggregation on one field. The response contains the aggregated percentile values as expected. I now need to filter on those percentile values, using "bucket_selector" to drop the buckets that do not match. For instance, a bucket should be included in the response only if its percentile value is > 60. Below is my sample aggregation request JSON:

    {
      "aggs": {
        "2": {
          "terms": {
            "field": "component",
            "size": 500,
            "order": {
              "1": "desc"
            }
          },
          "aggs": {
            "1": {
              "percentiles": {
                "field": "field1",
                "percents": [
                  99
                ],
                "keyed": false
              }
            },
            "filter_gt_than_60sec": {
              "bucket_selector": {
                "buckets_path": {
                  "value": "1"
                },
                "script": "params.value > 60L"
              }
            }
          }
        }
      },
      "size": 0,
      "_source": {
        "excludes": []
      },
      "stored_fields": [
        "*"
      ],
      "script_fields": {},
      "query": {
        "bool": {
          "must": [
            {
              "match_all": {}
            },
            {
              "range": {
                "@timestamp": {
                  "gte": 1547889125683,
                  "lte": 1547975525684,
                  "format": "epoch_millis"
                }
              }
            }
          ],
          "filter": [],
          "should": [],
          "must_not": []
        }
      },
      "timeout": "30000ms"
    }

The error I am getting:

        {
            "error": {
                "root_cause": [],
                "type": "search_phase_execution_exception",
                "reason": "",
                "phase": "fetch",
                "grouped": true,
                "failed_shards": [],
                "caused_by": {
                    "type": "aggregation_execution_exception",
                    "reason": "buckets_path must reference either a number value or a single value numeric metric aggregation, got: org.elasticsearch.search.aggregations.metrics.percentiles.tdigest.InternalTDigestPercentiles"
                }
            },
            "status": 503
        }

Sample response when no bucket selector is applied:

    {
      "aggregations": {
        "2": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {
              "1": {
                "values": [
                  {
                    "key": 99,
                    "value": 70
                  }
                ]
              },
              "key": "abc"
            },
            {
              "1": {
                "values": [
                  {
                    "key": 99,
                    "value": 10
                  }
                ]
              },
              "key": "abc1"
            }
          ]
        }
      }
    }

From the above error I understand that I can't apply "bucket_selector" directly to a percentiles aggregation, because it returns multiple values rather than a single numeric metric. How, then, can I keep only the buckets whose aggregated percentile value is greater than 60? I read about "percentiles_bucket", but it calculates percentiles; it does not filter buckets by an already-aggregated percentile value. Thanks in advance.

Upvotes: 1

Views: 2015

Answers (1)

sukumar c

Reputation: 31

Thanks. The issue is resolved now: I can access the percentile value by replacing the "bucket_selector" in the request above with the following, where the buckets_path names the 99th percentile explicitly:

    "bucket_selector": {
      "buckets_path": {
        "value": "1[99.0]"
      },
      "script": "params.value > 60L"
    }
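
For anyone building the request programmatically, here is a minimal Python sketch of the same fix. The helper name `percentile_selector` is hypothetical (it is not part of any Elasticsearch client), and the `order` clause from the question is left out for brevity. It only assembles the request body as a plain dict; sending it is left to whichever client you use.

```python
import json


def percentile_selector(agg_name: str, percent: float, threshold: int) -> dict:
    """Hypothetical helper: build a bucket_selector clause that keeps only
    buckets whose given percentile exceeds the threshold."""
    return {
        "bucket_selector": {
            # "<agg_name>[<percent>]" picks one value out of the multi-value
            # percentiles aggregation, e.g. "1[99.0]".
            "buckets_path": {"value": f"{agg_name}[{float(percent)}]"},
            "script": f"params.value > {threshold}L",
        }
    }


# Same aggregation names and fields as in the question.
aggs = {
    "2": {
        "terms": {"field": "component", "size": 500},
        "aggs": {
            "1": {
                "percentiles": {
                    "field": "field1",
                    "percents": [99],
                    "keyed": False,
                }
            },
            "filter_gt_than_60sec": percentile_selector("1", 99, 60),
        },
    }
}

body = {"size": 0, "aggs": aggs}
print(json.dumps(body, indent=2))
```

The key point is that the path must name the requested percentile, so the value in brackets has to match an entry in `percents`.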

Upvotes: 2
