Reputation: 3035
I need to do an aggregation + sorting + pagination in one of the indexes.
I learned about the internal workings of Elasticsearch.
I have 5 shards in total. Each shard sorts its own documents and returns its top results, which by default is 10 records per shard. The resulting 50 records are then sorted again, and the top 10 are returned, since the default size is 10.
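To make the scatter/gather step concrete, here is a minimal in-memory sketch. It does not talk to Elasticsearch; the function names `shard_top_n` and `coordinate` are hypothetical, and plain integers stand in for documents and their sort values.

```python
def shard_top_n(docs, n, key):
    """Each shard sorts its own documents and returns only its top n."""
    return sorted(docs, key=key)[:n]


def coordinate(shards, n=10, key=lambda d: d):
    """Coordinating node: merge the per-shard results, keep the global top n."""
    candidates = [doc for shard in shards for doc in shard_top_n(shard, n, key)]
    return sorted(candidates, key=key)[:n]


# Five shards holding the interleaved values 0..99 (20 values per shard).
shards = [list(range(i, 100, 5)) for i in range(5)]
result = coordinate(shards, n=10)
```

With 5 shards and n=10, the coordinating step looks at 50 candidates and keeps 10, which matches the description above.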
Output:
The aggregated results are returned in a separate field named "aggregations". The size and from parameters do not work for paginating this aggregated data.
So I tried termBuilder.size(500), but the logic differs according to this link (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html)
It leads to inaccurate data.
Can anyone suggest how to deal with aggregation + pagination?
Upvotes: 37
Views: 63629
Reputation: 323
Yes, pagination + sorting + searching is possible in Elasticsearch. Elasticsearch supports the Bucket Sort Aggregation in v6.X and later. bucket_sort operates on all the buckets produced by the parent terms/date_histogram aggregation. So the parent aggregation's size has to be large enough, at least the number of distinct buckets, so that every possible bucket is available for sorting and paging. Example:
{
"aggs": {
"aggs1": {
"terms": {
"field": "field_name.keyword",
// The terms aggregation can also be ordered here
"size": 1000000 // Keep this a large integer so that every possible bucket is returned
},
"aggs": {
"bucket_sort": {
"bucket_sort": {
"sort": [{
"_key": {
"order": "asc"
}
}],
// "from" and "size" here are applied to the buckets returned by the terms aggregation above, so they select the actual page
// Below is standard pagination
"from": 0,
"size": 10
}
}
}
}
},
"size": 0
}
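As a rough sketch of how a page number could be turned into the request above, the helper below builds the same body as a Python dict. The function name `paged_terms_body`, the inner aggregation name `page`, and the `max_buckets` default are assumptions for illustration, not part of any client library.

```python
def paged_terms_body(field, page, page_size, max_buckets=10000, order="asc"):
    """Build a terms + bucket_sort request body for one zero-based page."""
    if page < 0 or page_size <= 0:
        raise ValueError("page must be >= 0 and page_size must be > 0")
    return {
        "size": 0,  # no document hits, only aggregations
        "aggs": {
            "aggs1": {
                # The terms size must cover every bucket that may be paged over.
                "terms": {"field": field, "size": max_buckets},
                "aggs": {
                    "page": {
                        "bucket_sort": {
                            "sort": [{"_key": {"order": order}}],
                            "from": page * page_size,
                            "size": page_size,
                        }
                    }
                },
            }
        },
    }


body = paged_terms_body("field_name.keyword", page=2, page_size=10)
```

The resulting dict can be serialized with `json.dumps` and sent as the search body. Note that every page still makes Elasticsearch compute all `max_buckets` buckets before slicing.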
Upvotes: 17
Reputation: 107
If anyone else struggles with the same problem, here is a PHP and Elastica (http://elastica.io/) solution that works for me.
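use Elastica\Query;
use Elastica\Aggregation\Cardinality;
use Elastica\Aggregation\Filter;
use Elastica\Aggregation\Stats;
// Store and SearchConfiguration are assumed to be application classes, not part of Elastica.
function addAggregationFields($oAgg){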
$oAggField = new Stats('costs');
$oAggField->setField('costs');
$oAgg->addAggregation($oAggField);
return $oAgg;
}
function addAggregationFilters($oAggFilter){
$oFilters = new \Elastica\Query\Terms();
$oFilters->setTerms("user_id", [3,7]);
$oAggFilter->setFilter($oFilters);
return $oAggFilter;
}
$iItemsInPage = 100;
$iPage = 0;
$sGoupBy = "created_date";
$oStore = new Store();
$oStore->setConfiguration(new SearchConfiguration());
$oIndex = $oStore->getIndex("report_*");
$oAggFilter = new Filter('cardinality');
$oAggFilter = addAggregationFilters($oAggFilter);
$oAgg = new Cardinality('cardinality');
$oAgg->setField($sGoupBy);
$oAggFilter->addAggregation($oAgg);
$oCardinalityQuery = new Query();
$oCardinalityQuery->setSize(0);
$oCardinalityQuery->addAggregation($oAggFilter);
$resultSet = $oIndex->search($oCardinalityQuery)->getAggregations();
$iPages = 1; // Default when the cardinality is missing or zero
if(isset($resultSet['cardinality'])) {
$iCardinality = $resultSet['cardinality']['cardinality']['value'];
if(0 != $iCardinality) {
$iPages = (int) ceil($iCardinality / $iItemsInPage);
}
}
$oAggFilter = new Filter('aggregation_result');
$oAggFilter = addAggregationFilters($oAggFilter);
$oAgg = new \Elastica\Aggregation\Terms('terms');
$oAgg->setField($sGoupBy);
$oAgg->setParam("include", array("partition" => $iPage, "num_partitions" => $iPages));
$oAgg->setOrder('costs.sum', 'desc');
$oAgg->setSize($iItemsInPage);
$oAgg = addAggregationFields($oAgg); // Plain function call; there is no $this outside a class
$oAggFilter->addAggregation($oAgg);
$oQuery = new Query();
$oQuery->addAggregation($oAggFilter);
$oQuery->setSize(0);
$resultSet = $oIndex->search($oQuery)->getAggregations();
The process is described here: https://stackoverflow.com/a/54351245/2923963
Upvotes: -1
Reputation: 2039
Elasticsearch supports the Bucket Sort Aggregation in v6.1 and later. It allows "sort", "size" and "from" parameters within aggregated results.
Please refer to the Bucket Sort Aggregation documentation.
Upvotes: 12
Reputation: 1020
I think the Composite Aggregation might solve your problem, as it allows pagination within aggregated results.
Please refer to the Composite Aggregation documentation.
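A minimal sketch of paging with a composite aggregation could look like the following. The aggregation name `paged`, the source name `by_field`, and the `search` callable (anything that takes a request body and returns the parsed response dict) are hypothetical; the `after` / `after_key` handshake is the part taken from the composite aggregation itself.

```python
def composite_body(field, page_size, after_key=None):
    """Build a composite aggregation request for one page of buckets."""
    composite = {
        "size": page_size,
        "sources": [{"by_field": {"terms": {"field": field}}}],
    }
    if after_key is not None:
        # Resume right after the last bucket of the previous page.
        composite["after"] = after_key
    return {"size": 0, "aggs": {"paged": {"composite": composite}}}


def iter_buckets(search, field, page_size):
    """Yield every bucket, fetching one page per call to `search`."""
    after_key = None
    while True:
        response = search(composite_body(field, page_size, after_key))
        agg = response["aggregations"]["paged"]
        if not agg["buckets"]:
            return
        yield from agg["buckets"]
        after_key = agg.get("after_key")
        if after_key is None:
            return
```

Composite buckets come back ordered by their source values, so this suits "walk through all buckets" use cases better than "top N by some metric".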
Upvotes: 14
Reputation: 1
You can use a workaround. Suppose you want to show 10 records per page in ascending order of a field f1. Store the last value of that field on each page (the 10th, 20th, ... record), then fetch the next page with a "greater than" filter on that value plus a sort in the search query.
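One way to read this workaround is as keyset pagination over documents. The sketch below builds the search body for each page; the function name `page_query` is hypothetical, and it assumes the values of f1 are unique, since documents that tie with the stored last value would be skipped by the "greater than" filter.

```python
def page_query(field, page_size, last_value=None):
    """Build a search body for the page that follows `last_value`."""
    if last_value is None:
        query = {"match_all": {}}  # first page: no lower bound yet
    else:
        query = {"range": {field: {"gt": last_value}}}
    return {
        "size": page_size,
        "sort": [{field: {"order": "asc"}}],
        "query": query,
    }


first_page = page_query("f1", 10)
second_page = page_query("f1", 10, last_value=42)
```

Here 42 stands in for the f1 value of the 10th record returned by the first page.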
Upvotes: 0
Reputation: 631
In Elasticsearch, there is no accurate solution for this. You can filter with the partition options, but partitioning can break your sorted result. ES partitions the values of the given field and returns only the buckets from the requested partition, so your results end up ordered within each partition rather than globally. (You need to make subsequent requests with the other partition numbers to gather data from all partitions.)
My suggestion is to use a higher size value for the terms aggregation, as you mentioned in your question.
Upvotes: 0
Reputation: 1469
Paging aggregation results is supported using partitions. This section in the official docs is very helpful:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#_filtering_values_with_partitions
{
"size": 0,
"aggs": {
"expired_sessions": {
"terms": {
"field": "account_id",
"include": {
"partition": 0,
"num_partitions": 20
},
"size": 10000,
"order": {
"last_access": "asc"
}
},
"aggs": {
"last_access": {
"max": {
"field": "access_date"
}
}
}
}
}
}
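As a sketch of how the partition number and count might be derived, the helper below computes `num_partitions` from a distinct-value count (for example from a separate cardinality aggregation) and builds the request above as a Python dict. The function name `partition_page_body` and the `bucket_margin` parameter are assumptions; the margin exists because partitions are assigned by hashing and are not guaranteed to be equally sized.

```python
import math


def partition_page_body(field, partition, cardinality, page_size, bucket_margin=2):
    """Build a partitioned terms request for one zero-based partition."""
    num_partitions = max(1, math.ceil(cardinality / page_size))
    if not 0 <= partition < num_partitions:
        raise ValueError("partition out of range")
    return {
        "size": 0,
        "aggs": {
            "expired_sessions": {
                "terms": {
                    "field": field,
                    "include": {
                        "partition": partition,
                        "num_partitions": num_partitions,
                    },
                    # Leave headroom: a partition may hold more than page_size terms.
                    "size": page_size * bucket_margin,
                    "order": {"last_access": "asc"},
                },
                "aggs": {"last_access": {"max": {"field": "access_date"}}},
            }
        },
    }


body = partition_page_body("account_id", partition=3, cardinality=195, page_size=10)
```

The ordering applies within each partition only, so a globally sorted view still requires fetching all partitions and merging them on the client.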
Upvotes: 2
Reputation: 458
In Elasticsearch, it's not possible to paginate an aggregation directly. The terms aggregation may not give accurate results when a limited size is specified. So the only way to do sorting and pagination is to set the aggregation size to 0, which returns all the buckets (this meaning of size 0 applies to older Elasticsearch versions), collect all the results in a list, and then sort and slice that list in your application to get the required page.
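The client-side step could be sketched as follows, assuming the buckets have already been fetched into a list of dicts shaped like the terms aggregation output (`key` and `doc_count`). The function name `paginate_buckets` is hypothetical.

```python
def paginate_buckets(buckets, page, page_size, sort_key="doc_count", descending=True):
    """Sort all buckets in memory, then slice out one zero-based page."""
    ordered = sorted(buckets, key=lambda b: b[sort_key], reverse=descending)
    start = page * page_size
    return ordered[start:start + page_size]


buckets = [
    {"key": "a", "doc_count": 5},
    {"key": "b", "doc_count": 9},
    {"key": "c", "doc_count": 1},
    {"key": "d", "doc_count": 7},
    {"key": "e", "doc_count": 3},
]
second_page = paginate_buckets(buckets, page=1, page_size=2)
```

This keeps the results exact at the cost of transferring and holding every bucket, so it only scales to fields with a moderate number of distinct values.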
Upvotes: 17