Reputation: 3537
I'm trying to perform an Elasticsearch query and would like Elasticsearch to group the results for me, instead of having my client code do it manually. Looking at the Elasticsearch documentation, it appears that a bucketing aggregation is what I'm looking for, but I can't find any examples that use one or that show what the output looks like, so I can't be sure that's what I want.
My question is: is it possible to group documents by a key in Elasticsearch? If so, how and where can I find documentation on how to do it, either using the query DSL or (preferably) the Javadoc for the Java API?
Upvotes: 6
Views: 10114
Reputation: 1
Grouping With Aggregation Function
client.prepareSearch(indexName).setTypes(documentType).addAggregation(
        AggregationBuilders.terms("agg_name").field("gender").subAggregation(
                AggregationBuilders.topHits("documents").setSize(10)))
        .execute(new ActionListener<SearchResponse>() {
    @Override
    public void onResponse(SearchResponse response) {
        // One bucket per distinct value of "gender"
        Terms agg_name_aggregation = response.getAggregations().get("agg_name");
        for (Terms.Bucket bucket : agg_name_aggregation.getBuckets()) {
            TopHits topHits = bucket.getAggregations().get("documents");
            // Second search per bucket: maximum salary among documents with this gender
            SearchResponse response1 = client
                    .prepareSearch(indexName)
                    .setQuery(QueryBuilders.termQuery("gender", bucket.getKey()))
                    .addAggregation(AggregationBuilders.max("salary").field("salary"))
                    .execute().actionGet();
            for (Aggregation maxAggs : response1.getAggregations()) {
                Max max = (Max) maxAggs;
                double maxValue = max.getValue();
                System.out.println("Max Value => " + maxValue);
            }
            // System.out.println("term = " + bucket.getKey());
            // System.out.println("count =" + bucket.getDocCount());
            // System.out.println(topHits.getHits());
            // Print the source of each of the (up to 10) documents kept for this bucket
            for (SearchHit hit : topHits.getHits()) {
                System.out.println(hit.getSource());
            }
        }
    }

    @Override
    public void onFailure(Throwable e) {
        e.printStackTrace();
    }
});
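The loop above sends one extra search per bucket just to get the maximum salary. It should be possible to get the same number in the original request by nesting a max aggregation under the terms aggregation. Below is a minimal sketch in plain Java that only builds such a request body as a string. The class name, the method name and the aggregation names "by_group" and "max_value" are made up for illustration, and the field names are assumed to be simple identifiers that need no JSON escaping.

```java
final class GroupByQuerySketch {

    // Builds a search body with one terms bucket per value of groupField,
    // each bucket carrying the maximum of maxField as a sub-aggregation.
    // Assumption: both field names are plain identifiers (no escaping is done).
    static String termsWithMax(String groupField, String maxField) {
        return "{\n"
                + "  \"size\": 0,\n"
                + "  \"aggs\": {\n"
                + "    \"by_group\": {\n"
                + "      \"terms\": { \"field\": \"" + groupField + "\" },\n"
                + "      \"aggs\": {\n"
                + "        \"max_value\": { \"max\": { \"field\": \"" + maxField + "\" } }\n"
                + "      }\n"
                + "    }\n"
                + "  }\n"
                + "}";
    }

    public static void main(String[] args) {
        // Prints a body that can be sent with POST _search
        System.out.println(termsWithMax("gender", "salary"));
    }
}
```

With "size": 0 the response carries only the aggregations, so each bucket would come back with its key, its doc_count and a "max_value" entry, without a second round trip per bucket.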
Upvotes: 0
Reputation: 4489
I guess you are trying to group by a field in Elasticsearch. You can do that with a terms aggregation.
Here is how to do it using the query DSL:
POST _search
{
  "aggs": {
    "genders": {
      "terms": {
        "field": "gender"
      },
      "aggs": {
        "top_tag_hits": {
          "top_hits": {
            "_source": {
              "include": [
                "include_fields_name"
              ]
            },
            "size": 100
          }
        }
      }
    }
  }
}
Here gender is a field in the document. The response can look like this:
{
  ...
  "aggregations" : {
    "genders" : {
      "buckets" : [
        {
          "key" : "male",
          "doc_count" : 10,
          "top_tag_hits" : {"hits":...}
        },
        {
          "key" : "female",
          "doc_count" : 10,
          "top_tag_hits" : {"hits":...}
        }
      ]
    }
  }
}
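To make the shape of that response concrete, here is a rough plain-Java analogy of what the terms aggregation with a top_hits sub-aggregation computes. It does not use Elasticsearch at all. Doc, Bucket and groupByGender are hypothetical names, and the "top hits" here are simply the first few documents seen, whereas Elasticsearch ranks them by score unless a sort is given.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TermsBucketSketch {

    // Hypothetical in-memory stand-in for an indexed document.
    static final class Doc {
        final String gender;
        final double salary;
        Doc(String gender, double salary) { this.gender = gender; this.salary = salary; }
    }

    // Hypothetical stand-in for one entry of the "buckets" array.
    static final class Bucket {
        final String key;                              // "key" in the response
        long docCount;                                 // "doc_count" in the response
        final List<Doc> topHits = new ArrayList<>();   // the top_hits sub-aggregation
        Bucket(String key) { this.key = key; }
    }

    // Groups docs by gender, counts them, and keeps at most topHitsSize docs per group.
    static List<Bucket> groupByGender(List<Doc> docs, int topHitsSize) {
        Map<String, Bucket> byKey = new LinkedHashMap<>();
        for (Doc d : docs) {
            Bucket b = byKey.computeIfAbsent(d.gender, Bucket::new);
            b.docCount++;
            if (b.topHits.size() < topHitsSize) {
                b.topHits.add(d);
            }
        }
        List<Bucket> buckets = new ArrayList<>(byKey.values());
        // The terms aggregation returns buckets ordered by doc_count, largest first, by default.
        buckets.sort(Comparator.comparingLong((Bucket b) -> b.docCount).reversed());
        return buckets;
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
                new Doc("male", 100.0), new Doc("female", 200.0), new Doc("female", 150.0));
        for (Bucket b : groupByGender(docs, 1)) {
            System.out.println(b.key + " doc_count=" + b.docCount + " top_hits=" + b.topHits.size());
        }
    }
}
```

The point of the aggregation is that this grouping happens on the server, so the client only has to walk the returned buckets.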
Using the Java API, with a top_hits sub-aggregation added in response to your comment:
client.prepareSearch("index").setTypes("types").addAggregation(
        AggregationBuilders.terms("agg_name").field("gender").subAggregation(
                AggregationBuilders.topHits("documents").setSize(10)
        )
).execute(new ActionListener<SearchResponse>() {
    @Override
    public void onResponse(SearchResponse response) {
        Terms agg_name_aggregation = response.getAggregations().get("agg_name");
        for (Terms.Bucket bucket : agg_name_aggregation.getBuckets()) {
            TopHits topHits = bucket.getAggregations().get("documents");
            System.out.println("term = " + bucket.getKey());
            // do what you want with top hits..
        }
    }

    @Override
    public void onFailure(Throwable e) {
        e.printStackTrace();
    }
});
Hope this helps!!
Upvotes: 11