Aggregate document value per hour

Question

I have a question about aggregations. I read about the Date Histogram Aggregation, but it only seems to sort documents by date. I have an index visits with the fields date and visited_page, and I want to aggregate counts per hour (e.g., the number of user visits to each page per hour). Should I use the aggregation above, or do I need to aggregate in a different way?

deerawan · Accepted Answer

The query should look like this:

GET {index_name}/{type}/_search
{
  "size": 0, // return no hits, only aggregations; this can also speed up the query
  "aggs": {
    "unique_visited_page": {
      "terms": {
        "field": "visited_page" // this field must be mapped as keyword
      },
      "aggs": {
        "visit_page_per_hour": {
          "date_histogram": {
            "field": "date_field", // replace with your date field (date in the question)
            "interval": "hour" // on Elasticsearch 7.2+, use "calendar_interval" instead
          }
        }
      }
    }
  }
}

We aggregate by visited_page first; then, within each visited_page bucket, we drill down per hour to get the count.
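
To make the nesting concrete, here is a minimal Python sketch of what the two levels compute, run on in-memory data rather than on Elasticsearch. The visits list and the function name visits_per_page_per_hour are made up for illustration, and it assumes ISO 8601 timestamps with an explicit UTC offset.

```python
from collections import Counter
from datetime import datetime, timezone

# Hypothetical visits, invented for illustration only.
visits = [
    {"visited_page": "a.html", "date": "2018-07-24T09:05:00+00:00"},
    {"visited_page": "a.html", "date": "2018-07-24T09:40:00+00:00"},
    {"visited_page": "a.html", "date": "2018-07-24T10:15:00+00:00"},
    {"visited_page": "b.html", "date": "2018-07-24T09:59:00+00:00"},
]


def visits_per_page_per_hour(docs):
    """Count documents per (page, hour), like terms + date_histogram."""
    counts = Counter()
    for doc in docs:
        ts = datetime.fromisoformat(doc["date"]).astimezone(timezone.utc)
        # Truncate to the start of the hour; this is the histogram bucket key.
        hour = ts.replace(minute=0, second=0, microsecond=0)
        counts[(doc["visited_page"], hour.isoformat())] += 1
    return dict(counts)


counts = visits_per_page_per_hour(visits)
for (page, hour), n in sorted(counts.items()):
    print(page, hour, n)
```

The outer key plays the role of the terms bucket and the truncated hour plays the role of the date_histogram bucket; the counter value corresponds to doc_count.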

Example response using my sample data:

{
  ...
  "hits": {
    "total": 4,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "unique_visited_page": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "contact.html",
          "doc_count": 2,
          "visit_page_per_hour": {
            "buckets": [
              {
                "key_as_string": "2018-07-24T14:00:00.000Z",
                "key": 1532440800000,
                "doc_count": 1
              },
              {
                "key_as_string": "2018-07-24T15:00:00.000Z",
                "key": 1532444400000,
                "doc_count": 1
              }
            ]
          }
        },
        {
          "key": "index.html",
          "doc_count": 1,
          "visit_page_per_hour": {
            "buckets": [
              {
                "key_as_string": "2018-07-24T13:00:00.000Z",
                "key": 1532437200000,
                "doc_count": 1
              }
            ]
          }
        },
        {
          "key": "page.html",
          "doc_count": 1,
          "visit_page_per_hour": {
            "buckets": [
              {
                "key_as_string": "2018-07-24T13:00:00.000Z",
                "key": 1532437200000,
                "doc_count": 1
              }
            ]
          }
        }
      ]
    }
  }
}

Each top-level key in the result is a visited_page value. Its documents are then bucketed per hour, and each hourly bucket returns a doc_count. That doc_count is probably the value you want.
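
If you need the counts as flat rows on the client side, the nested buckets can be walked with two loops. This is only a sketch: flatten_hourly_counts is a hypothetical helper name, and sample_response is a trimmed copy of the response above, already parsed into a Python dict (how you fetch it is up to your client).

```python
def flatten_hourly_counts(response):
    """Turn the nested buckets into (page, hour, count) tuples."""
    rows = []
    pages = response["aggregations"]["unique_visited_page"]["buckets"]
    for page_bucket in pages:
        for hour_bucket in page_bucket["visit_page_per_hour"]["buckets"]:
            rows.append(
                (
                    page_bucket["key"],
                    hour_bucket["key_as_string"],
                    hour_bucket["doc_count"],
                )
            )
    return rows


# Trimmed copy of the example response (first two page buckets only).
sample_response = {
    "aggregations": {
        "unique_visited_page": {
            "buckets": [
                {
                    "key": "contact.html",
                    "doc_count": 2,
                    "visit_page_per_hour": {
                        "buckets": [
                            {"key_as_string": "2018-07-24T14:00:00.000Z",
                             "key": 1532440800000, "doc_count": 1},
                            {"key_as_string": "2018-07-24T15:00:00.000Z",
                             "key": 1532444400000, "doc_count": 1},
                        ]
                    },
                },
                {
                    "key": "index.html",
                    "doc_count": 1,
                    "visit_page_per_hour": {
                        "buckets": [
                            {"key_as_string": "2018-07-24T13:00:00.000Z",
                             "key": 1532437200000, "doc_count": 1},
                        ]
                    },
                },
            ]
        }
    }
}

rows = flatten_hourly_counts(sample_response)
for row in rows:
    print(row)
```

Each tuple is one page/hour pair with its visit count, which is usually easier to feed into a table or chart than the nested structure.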

Hope it helps.
