f.trajkovski

Reputation: 825

Elasticsearch: aggregate by one field, then split each bucket by a date range

I have entities of this type saved in Elasticsearch:

class Value {
  
  private Integer priority;
  private Date dueDate;

}

The possible values for priority are 1 and 2 (I know that an enum should be used, but let's not concentrate on that). Using Elasticsearch, I want to get the following results:

  1. How many values I have with priority 1 that have dueDate today or in the past
  2. How many values I have with priority 1 that have dueDate null or in the future
  3. How many values I have with priority 2 that have dueDate today or in the past
  4. How many values I have with priority 2 that have dueDate null or in the future
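To pin down the intended semantics, here is a minimal plain-Java sketch of the four groups, with no Elasticsearch involved. The class name, method name, and bucket labels are hypothetical and only serve as a reference against which the aggregation output can be checked; a null dueDate is treated the same as a future one.

```java
import java.time.LocalDate;

public class DueBuckets {

    /**
     * Returns a label for one value. A dueDate of today or earlier counts as
     * "due"; a future or null dueDate counts as "upcoming".
     */
    public static String bucket(int priority, LocalDate dueDate, LocalDate today) {
        boolean dueOrOverdue = dueDate != null && !dueDate.isAfter(today);
        return "p" + priority + (dueOrOverdue ? "_due" : "_upcoming");
    }

    public static void main(String[] args) {
        // Fixed reference date so the output does not depend on when this runs.
        LocalDate today = LocalDate.of(2024, 5, 10);
        System.out.println(bucket(1, LocalDate.of(2024, 5, 10), today)); // p1_due
        System.out.println(bucket(1, null, today));                      // p1_upcoming
        System.out.println(bucket(2, LocalDate.of(2024, 5, 9), today));  // p2_due
        System.out.println(bucket(2, LocalDate.of(2024, 5, 11), today)); // p2_upcoming
    }
}
```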

What I tried doing is the following:

SearchRequest request = new SearchRequest.Builder()
                .index(MY_INDEX)
                .query(getQuery(someKeys))
                .aggregations(ELASTIC_AGG_PRIORITY, Aggregation.of(a ->
                        a.terms(terms -> terms.field(ELASTIC_AGG_PRIORITY).missing(""))))
                .build();

This gives me buckets grouped by priority, so I know how many values have priority 1 and how many have priority 2. But I need to add nested aggregations, and this is where I am stuck: how do I split each of these buckets into one with dueDate today or in the past and one with dueDate in the future or null? Additionally, I am having trouble writing the range query:

public static Query getRangeQuery(String field) {
    return QueryBuilders.range().field(field).lte(JsonData.of(new Date())).build()._toQuery();
}

I am not able to pass the year, and I don't know how to handle the nulls when I need to select everything in the future together with the null values. I tried the following, but I am not getting the result I need:
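One way to sidestep both problems is to let Elasticsearch resolve the date itself through date math and to express "null" as "field does not exist". The sketch below only builds the query DSL as JSON strings, so it runs without the client library; the class and method names are hypothetical. It relies on the documented rounding of range queries, where "now/d" rounds up to the end of the day for lte and gt, and on the default UTC time zone. With the Java client, the same "now/d" string could presumably be passed as JsonData.of("now/d") instead of a serialized Date, but that is an assumption to verify against your client version.

```java
public class DueDateQueries {

    /** Matches documents whose date field is today or earlier. */
    public static String dueOrOverdue(String field) {
        return """
            {
              "range": { "%1$s": { "lte": "now/d" } }
            }""".formatted(field);
    }

    /** Matches documents whose date field is after today OR is missing (null). */
    public static String futureOrMissing(String field) {
        return """
            {
              "bool": {
                "should": [
                  { "range": { "%1$s": { "gt": "now/d" } } },
                  { "bool": { "must_not": [ { "exists": { "field": "%1$s" } } ] } }
                ],
                "minimum_should_match": 1
              }
            }""".formatted(field);
    }

    public static void main(String[] args) {
        System.out.println(dueOrOverdue("dueDate"));
        System.out.println(futureOrMissing("dueDate"));
    }
}
```

The two queries are complementary, so every document matches exactly one of them.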

SearchRequest request = new SearchRequest.Builder()
                .index(MY_INDEX)
                .query(getQuery(someKeys))
                .aggregations(ELASTIC_AGG_ASSIGNEE, Aggregation.of(a ->
                        a.terms(terms -> terms.field(ELASTIC_AGG_ASSIGNEE).missing(""))))
                .aggregations(ELASTIC_AGG_PRIORITY, Aggregation.of(a ->
                                a.terms(terms -> terms.field(ELASTIC_AGG_PRIORITY).missing(""))
                                        .aggregations(ELASTIC_AGG_PRIORITY, Aggregation.of(filterAgg -> filterAgg
                                                .filter(QueryUtil.getRangeQuery("dueDate"))))
                        )
                ).build();

where QueryUtil.getRangeQuery() is the method shown above.

So, to summarise, my question is: how do I get the four results listed above?

Upvotes: 1

Views: 246

Answers (1)

brugia

Reputation: 483

You're close, but you need to use the date_range aggregation to get the results you are looking for.

Your terms aggregation is OK; all you need to do is add the date_range aggregation as a sub-aggregation of the terms aggregation.

For your null values, set "missing" to a date in the past. That way you can pick them up as a third bucket.

Sample (in JSON):

{
   "aggs": {
       "range": {
           "date_range": {
               "field": "date",
               "missing": "1976/11/30",
               "ranges": [
                  { 
                    "key": "Oldest",
                    "to" : "1976/11/30"
                  },
                  {
                    "key": "Past",
                    "from" : "1976/11/30",
                    "to": "now"
                  }, 
                  {
                    "key": "Future",
                    "from": "now"
                  }
              ]
          }
      }
   }
}
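Applied to the fields from the question, the same idea could look like the sketch below: a terms aggregation on priority with a date_range sub-aggregation on dueDate. This is a variation on the sample, not a tested solution. It substitutes a far-future "missing" value so that null dates land in the future bucket and only two ranges are needed. The class name, the aggregation names (by_priority, by_due_date), and the bucket keys are hypothetical, and the "9999-12-31" value assumes the field uses the default date format. Keep in mind that date_range includes the from value and excludes the to value, so a "missing" date equal to a boundary falls into the bucket that starts at that boundary. The boundary "now+1d/d" is meant as the start of tomorrow in UTC; the time_zone parameter may be needed for other zones.

```java
public class PriorityDueDateAgg {

    /** Builds a search body that yields two due-date buckets per priority value. */
    public static String requestBody(String priorityField, String dateField) {
        return """
            {
              "size": 0,
              "aggs": {
                "by_priority": {
                  "terms": { "field": "%1$s" },
                  "aggs": {
                    "by_due_date": {
                      "date_range": {
                        "field": "%2$s",
                        "missing": "9999-12-31",
                        "ranges": [
                          { "key": "due_or_overdue", "to": "now+1d/d" },
                          { "key": "future_or_null", "from": "now+1d/d" }
                        ]
                      }
                    }
                  }
                }
              }
            }""".formatted(priorityField, dateField);
    }

    public static void main(String[] args) {
        // Field names taken from the Value class in the question.
        System.out.println(requestBody("priority", "dueDate"));
    }
}
```

Each priority bucket in the response should then contain a due_or_overdue and a future_or_null bucket, whose doc_count values are the four numbers asked for.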

Let me know if this works for you.

Upvotes: 1
