Reputation: 172
I'm using Elasticsearch with the Java API, and I'm trying to get the average of the lowest value in each bucket of a terms aggregation. One solution I found is to fetch the per-bucket minimums like this
AggregationBuilders.terms("group_by_flights").field("flight_id")
.subAggregation(AggregationBuilders.min("minimum").field("duration"))
and then compute the average on the application side. The problem is that if there are a lot of buckets, this allocates a lot of memory. I would like to do this on the Elasticsearch side. I found that there is an avg_bucket pipeline aggregation, which can be added as a sibling aggregation to terms (and others)
"the average": {
"avg_bucket": {
"buckets_path": "some_bucket_path"
}
}
The problem is that in the Java API you can add a pipeline aggregation only as a sub-aggregation. So if we construct the aggregation like this, the pipeline can't see our terms aggregation
AggregationBuilders.terms("group_by_flights").field("flight_id")
// the "group_by_flights.duration" path won't be seen here, because the pipeline is a sub-aggregation of group_by_flights itself
.subAggregation(PipelineAggregatorBuilders.avgBucket("avg", "group_by_flights.duration"))
I was thinking about creating an empty top-level aggregation and adding all the other aggregations to it as sub-aggregations, but that seems like a silly workaround, and I suspect I'm not understanding something correctly. Any ideas?
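To pin down the target value, here is a minimal plain-Java sketch of the number the avg_bucket aggregation is expected to produce: group by flight id, take the minimum duration per group, then average those minimums. It has no Elasticsearch dependency. The FlightDuration class, the averageOfMinimums method and the sample values are made up for illustration, and it keeps one entry per bucket in memory, which is exactly the cost the question wants to avoid.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AvgOfBucketMins {

    // Hypothetical stand-in for one indexed document: a flight id and a duration.
    static final class FlightDuration {
        final String flightId;
        final double duration;

        FlightDuration(String flightId, double duration) {
            this.flightId = flightId;
            this.duration = duration;
        }
    }

    // Average of the per-flight minimum durations; NaN when there are no rows.
    static double averageOfMinimums(List<FlightDuration> rows) {
        // Equivalent of terms("group_by_flights") + min("minimum"):
        // one running minimum per flight id.
        Map<String, Double> minByFlight = new HashMap<>();
        for (FlightDuration row : rows) {
            minByFlight.merge(row.flightId, row.duration, Double::min);
        }
        if (minByFlight.isEmpty()) {
            return Double.NaN;
        }
        // Equivalent of avg_bucket over those minimums.
        double sum = 0.0;
        for (double min : minByFlight.values()) {
            sum += min;
        }
        return sum / minByFlight.size();
    }

    public static void main(String[] args) {
        List<FlightDuration> rows = Arrays.asList(
                new FlightDuration("A", 120), new FlightDuration("A", 90),
                new FlightDuration("B", 60), new FlightDuration("B", 75));
        // Minimums are A=90 and B=60, so this prints 75.0
        System.out.println(averageOfMinimums(rows));
    }
}
```

Any server-side solution should return the same value for the same data, which makes this a convenient check on a small index.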
Upvotes: 1
Views: 2537
Reputation: 146
My solution is to use FilterAggregationBuilder, which can also filter the data. The first sub-aggregation builds the buckets, and the second sub-aggregation merges the bucket data.
AggregationBuilders.filter("global_aggregation", bool)
.subAggregation((AggregationBuilders.terms("group_by_flights").field("flight_id"))
.subAggregation(AggregationBuilders.min("min").field("duration")))
.subAggregation(PipelineAggregatorBuilders.avgBucket("avg_bucket_aggs", "group_by_flights>min"));
Upvotes: 1
Reputation: 172
The only solution I have found so far is to make the aggregations sub-aggregations of an "empty" global aggregation
AggregationBuilders.global("global_aggregation")
.subAggregation((AggregationBuilders.terms("group_by_flights").field("flight_id"))
.subAggregation(AggregationBuilders.min("min").field("duration")))
.subAggregation(PipelineAggregatorBuilders.avgBucket("avg_bucket_aggs","group_by_flights>min"))
Upvotes: 1