CamiloSalomon

Reputation: 106

ElasticSearch: Nested buckets aggregation

I'm new to Elasticsearch, so this question may be trivial, but here goes:

I'm using kibana_sample_data_ecommerce, whose documents have a mapping like this:

{
    ...
    "order_date" : <datetime>,
    "taxful_total_price" : <double>
    ...
}

I want to get the typical daily behavior of the data:

[Figure: sales behavior over a day]

I expect output documents like this:

[
  {
    "qtime" : "00:00",
    "mean" : 20,
    "std" : 40
  },
  {
    "qtime" : "01:00",
    "mean" : 150,
    "std" : 64
  }, 
  ...
]

So the process I think I need is:

Group all records by day ->
  Group by time window within each day ->
    Sum all records in each time window ->
      Take the cumulative sum of those sums per time window, which gives the behavior of a day ->
        Run extended_stats over the same time window across all days

That can be expressed like this:

[Figure: nested bucket aggregation]

But I can't unwrap those buckets to compute the statistics. Could you give me some advice on how to do that operation and get that result?

Here is my current query (Kibana Dev Tools):

POST kibana_sample_data_ecommerce/_search
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "order_date": {
              "gt": "now-1M",
              "lte": "now"
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "day_histo": {
      "date_histogram": {
        "field": "order_date",
        "calendar_interval": "day"
      },
      "aggs": {
        "qmin_histo": {
          "date_histogram": {
            "field": "order_date",
            "calendar_interval": "hour"
          },
          "aggs": {
            "qminute_sum": {
              "sum": {
                "field": "taxful_total_price"
              }
            },
            "cumulative_qminute_sum": {
              "cumulative_sum": {
                "buckets_path": "qminute_sum"
              }
            }
          }
        }
      }
    }
  }
}

Upvotes: 7

Views: 976

Answers (1)

Joe - Check out my books

Reputation: 16895

Here's how you can compute the extended stats for each hourly bucket within each day:

{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "order_date": {
              "gt": "now-4M",
              "lte": "now"
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "by_day": {
      "date_histogram": {
        "field": "order_date",
        "calendar_interval": "day"
      },
      "aggs": {
        "by_hour": {
          "date_histogram": {
            "field": "order_date",
            "calendar_interval": "hour"
          },
          "aggs": {
            "by_taxful_total_price": {
              "extended_stats": {
                "field": "taxful_total_price"
              }
            }
          }
        }
      }
    }
  }
}

yielding

[Image: screenshot of the query response]
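To get from these nested buckets to one row per hour of day (the "qtime" / "mean" / "std" shape asked for in the question), one option is to flatten the response on the client. The following is only a minimal sketch under some assumptions: the response has already been parsed into a Python dict, it uses the aggregation names from the query above (by_day, by_hour, by_taxful_total_price), bucket keys are epoch milliseconds interpreted as UTC, and the standard deviation is the population one. The function name hourly_profile is hypothetical and not part of any Elasticsearch client.

```python
from collections import defaultdict
from datetime import datetime, timezone
from statistics import mean, pstdev


def hourly_profile(response, day_agg="by_day", hour_agg="by_hour",
                   stats_agg="by_taxful_total_price"):
    """Flatten day -> hour buckets into per-hour-of-day mean/std of the
    running (cumulative) daily sales total.

    Hypothetical helper: the aggregation names default to those used in
    the query above and must match whatever the query actually defines.
    """
    per_hour = defaultdict(list)  # "HH:00" -> cumulative sums, one per day

    for day in response["aggregations"][day_agg]["buckets"]:
        running = 0.0
        # date_histogram buckets arrive in ascending time order.
        for hour in day[hour_agg]["buckets"]:
            # "sum" comes from extended_stats; treat a missing value as 0.
            running += hour[stats_agg]["sum"] or 0.0
            # Assumption: keys are epoch milliseconds, read here as UTC.
            ts = datetime.fromtimestamp(hour["key"] / 1000, tz=timezone.utc)
            per_hour[ts.strftime("%H:00")].append(running)

    # One row per hour of day, aggregated across all days.
    return [
        {"qtime": qtime, "mean": mean(values), "std": pstdev(values)}
        for qtime, values in sorted(per_hour.items())
    ]
```

Note that an hour only contributes for the days in which it appears as a bucket, so hours before a day's first order (or after its last) are simply absent for that day in this sketch; padding them with zeros or the last running total would be a separate design choice.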

Upvotes: 1
