How to filter by sub-aggregated results in Elasticsearch

I've got the following elastic search query in order to get the number of product sales per hour grouped by product id and hour of sale.

POST /my_sales/_search?size=0
{
  "aggs": {
    "sales_per_hour": {
      "date_histogram": {
        "field": "event_time",
        "fixed_interval": "1h",
        "format": "yyyy-MM-dd:HH:mm"
      },
      "aggs": {
        "sales_per_hour_per_product": {
          "terms": {
            "field": "name.keyword"
          }
        }
      }
    }
  }
}

One example of data :

{
    "@timestamp" : "2020-10-29T18:09:56.921Z",
    "name" : "my-beautifull_product",
    "event_time" : "2020-10-17T08:01:33.397Z"
}

This query returns several buckets (one per hour and per product) but i would like to only retrieve those who have a doc_count higher than 10 for example, is it possible ? For those results i would like to know the id of the product and the event_time bucket.

Thanks for your help.

Upvotes: 1

Views: 310

Answers (3)

Thank you for your help this is not far from what i would like but not exactly ; with the bucket selector i have something like this :

"aggregations" : {
    "sales_per_hour" : {
      "buckets" : [
        {
          "key_as_string" : "2020-08-31:23:00",
          "key" : 1598914800000,
          "doc_count" : 16,
          "sales_per_hour_per_product" : {
          "doc_count_error_upper_bound" : 0,
          "sum_other_doc_count" : 0,
          "buckets" : [
          {
              "key" : "my_product_1",
              "doc_count" : 2
          },
          {
              "key" : "my_product_2",
              "doc_count" : 2
          },
          {
              "key" : "myproduct_3",
              "doc_count" : 12
          }
        ]
      }
    }
    ]
}

And sometimes none of the buckets are greater than 10, is it possible to have the same thing but with the filter on _count applied to the second level aggregation (sales_per_hour_per_product) and not on the first level (sales_per_hour) ?

Upvotes: 0

Bhavya
Bhavya

Reputation: 16192

Try out this below search query:

{
  "aggs": {
    "sales_per_hour": {
      "date_histogram": {
        "field": "event_time",
        "fixed_interval": "1h",
        "format": "yyyy-MM-dd:HH:mm"
      },
      "aggs": {
        "sales_per_hour_per_product": {
          "terms": {
            "field": "name.keyword"
          },
          "aggs": {
            "the_filter": {
              "bucket_selector": {
                "buckets_path": {
                  "the_doc_count": "_count"
                },
                "script": "params.the_doc_count > 10"
              }
            }
          }
        }
      }
    }
  }
}

It will filter out all the documents, whose count is greater than 10 based on "params.the_doc_count > 10"

Upvotes: 1

Ricardo Ferreira
Ricardo Ferreira

Reputation: 1441

Perhaps using the Bucket Selector feature will help on filtering out the results.

Upvotes: 1

Related Questions