Tornike Menabde

Reputation: 126

I need to get the average document count by date in Elasticsearch

I want the average number of documents per day without fetching every bucket and computing the average by hand. There are years of data, and when I group by date I get a too_many_buckets_exception. My current query is:

{
  "query": {
    "bool": {
      "must": [],
      "filter": []
    }
  },
  "aggs": {
    "groupByChannle": {
      "terms": {
        "field": "channel"
      },
      "aggs": {
        "docs_per_day": {
          "date_histogram": {
            "field": "message_date",
            "fixed_interval": "1d"
          }
        }
      }
    }
  }
}

How can I get the average doc count grouped by message_date (day) and channel without retrieving the buckets array for this data?

"buckets" : [
              {
                "key_as_string" : "2018-03-17 00:00:00",
                "key" : 1521244800000,
                "doc_count" : 4027
              },
              {
                "key_as_string" : "2018-03-18 00:00:00",
                "key" : 1521331200000,
                "doc_count" : 10133
              },
...thousands of rows
]

My index mapping looks like this:

"mappings" : {
  "properties" : {
    "channel" : {
      "type" : "keyword"
    },
    "message" : {
      "type" : "text"
    },
    "message_date" : {
      "type" : "date",
      "format" : "yyyy-MM-dd HH:mm:ss"
    }
  }
}

From this query, I want just the average doc count per day and nothing else.

Upvotes: 1

Views: 1641

Answers (2)

Vakhtang

Reputation: 431

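I think you can use a stats aggregation with a bucket_script. A stats aggregation on a date field returns min and max as epoch milliseconds, so the script divides the document count by the number of days between the earliest and latest message_date: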

{
  "size": 0,
  "aggs": {
    "term": {
      "terms": {
        "field": "channel"
      },
      "aggs": {
        "stats": {
          "stats": {
            "field": "message_date"
          }
        },
        "result": {
          "bucket_script": {
            "buckets_path": {
              "max" : "stats.max",
              "min" : "stats.min",
              "count" : "stats.count"
            },
            "script": "params.count / ((params.max - params.min) / 1000 / 86400)"
          }
        }
      }
    }
  }
}

Upvotes: 1

Gibbs

Reputation: 22974

"avg_count": {
  "avg_bucket": {
    "buckets_path": "docs_per_day>_count"
  }
}

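Add this inside the aggs of groupByChannle, as a sibling of docs_per_day (right after its closing brace).

avg_count returns the average of the per-day doc counts. _count refers to each bucket's document count.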
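
For context, the full request with this snippet in place might look like the sketch below. It reuses the field and aggregation names from the question (including the groupByChannle spelling). The "size": 0 setting is an addition to suppress search hits, and the question's empty bool query is left out; neither choice comes from this answer.

```json
{
  "size": 0,
  "aggs": {
    "groupByChannle": {
      "terms": {
        "field": "channel"
      },
      "aggs": {
        "docs_per_day": {
          "date_histogram": {
            "field": "message_date",
            "fixed_interval": "1d"
          }
        },
        "avg_count": {
          "avg_bucket": {
            "buckets_path": "docs_per_day>_count"
          }
        }
      }
    }
  }
}
```

To keep the per-day buckets out of the response, response filtering may help, for example by appending ?filter_path=aggregations.groupByChannle.buckets.key,aggregations.groupByChannle.buckets.avg_count to the _search request. The daily buckets are still built on the server, so this trims the response but is unlikely to avoid too_many_buckets_exception by itself.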

Upvotes: 2
