Reputation: 1017
I use the following simple query to search across documents in my Elastic index:
{
  "query": { "query_string": { "query": "*test*" } },
  "aggregations": {
    "myaggregation": {
      "terms": { "field": "myField.raw", "size": 0 }
    }
  }
}
This returns the number of documents per distinct value of myField.raw.
Since I'm interested in the actual documents rather than just the counts, I tried adding the following top_hits sub-aggregation:
{
  "query": { "query_string": { "query": "*test*" } },
  "aggregations": {
    "myaggregation": {
      "terms": { "field": "myField.raw", "size": 0 },
      "aggregations": {
        "hits": {
          "top_hits": { "size": 2000000 }
        }
      }
    }
  }
}
This ugly usage of top_hits works, but is slow as hell.
Is there a proper way to fetch the actual documents for each bucket after doing the terms aggregation?
Upvotes: 11
Views: 11372
Reputation: 344
Have you considered using collapse on the field?
It returns the documents grouped under inner_hits (hits.hits[].inner_hits.<collapse-group-name>.hits.hits[]._source).
Reference: https://www.elastic.co/guide/en/elasticsearch/reference/6.8/search-request-collapse.html
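A rough sketch of what that could look like for the query in the question, assuming myField.raw is a keyword (or numeric) field with doc values enabled, which collapse requires. The group name docs_per_value and both size values are placeholders I made up, not anything from the question:

```json
{
  "query": { "query_string": { "query": "*test*" } },
  "size": 10,
  "collapse": {
    "field": "myField.raw",
    "inner_hits": {
      "name": "docs_per_value",
      "size": 100
    }
  }
}
```

With this, each top-level hit stands for one distinct value of myField.raw, and the documents for that value should be under hits.hits[].inner_hits.docs_per_value.hits.hits[]. Note that inner_hits size is capped by the index.max_inner_result_window setting (100 by default, as far as I know), so this probably won't hand back every document of a very large group in one request.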
Upvotes: 2