Jacob
Jacob

Reputation: 3

Elasticsearch Aggregation and max count-type operation

I have some documents stored in Elasticsearch like the following:

{
          "date" : 1,
          "field1" : 0.2,
          "field2" : 0.5,
          "field3" : 0.3
},
{
          "date" : 1,
          "field1" : 0.9,
          "field2" : 0.5,
          "field3" : 0.1
},
{
          "date" : 2,
          "field1" : 0.2,
          "field2" : 0.6,
          "field3" : 0.7
}

and what I'd like to get is a count for how many times each of field1, field2, or field3 were the greatest for each document, grouped by date, ie. expecting result to be something like:

{
          "date" : 1,
          "field1-greatest" : 1,
          "field2-greatest" : 1,
          "field3-greatest" : 0
},
{
          "date" : 2,
          "field1-greatest" : 0,
          "field2-greatest" : 0,
          "field3-greatest" : 1
}

I'm using a terms aggregation on date, but not sure how to compare different fields to do this max and count type operation using Elasticsearch aggregations. Any suggestions?

Upvotes: 0

Views: 534

Answers (1)

Joe - Check out my books
Joe - Check out my books

Reputation: 16895

You could go with the following:

{
  "size": 0,
  "aggs": {
    "by_date": {
      "terms": {
        "field": "date"
      },
      "aggs": {
        "field1_greatest": {
          "max": {
            "field": "field1"
          }
        },
        "field2_greatest": {
          "max": {
            "field": "field2"
          }
        },
        "field3_greatest": {
          "max": {
            "field": "field3"
          }
        }
      }
    }
  }
}

Tip: make sure your field* properties are mapped as type double, not float because the max agg for field1, for example, may yield 0.8999999761581421 instead of 0.9.


CORRECTION

It's a non-trivial use case so you'll probably need to use a script. Here's something to get you started:

{
  "size": 0,
  "aggs": {
    "by_date": {
      "terms": {
        "field": "date"
      },
      "aggs": {
        "by_greatest": {
          "scripted_metric": {
            "init_script": """
              state.field1_greatest = 0;
              state.field2_greatest = 0;
              state.field3_greatest = 0;
            """,
            "map_script": """
              def v1 = doc['field1'].value;
              def v2 = doc['field2'].value;
              def v3 = doc['field3'].value;
              
              // your comparison logic
            """,
            "combine_script": "state",
            "reduce_script": "states"
          }
        }
      }
    }
  }
}

Upvotes: 0

Related Questions