Reputation: 451
ES is not mainstream for my work, and there's one behavior I'm not able to correct. I have a fairly simple aggregation query:
GET /my_index/_search
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "request_type": "some_type"
          }
        },
        {
          "match": {
            "carrier_name.keyword": "some_carrier"
          }
        }
      ]
    }
  },
  "aggs": {
    "by_date": {
      "terms": {
        "field": "date",
        "order": {
          "_term": "asc"
        }
      },
      "aggs": {
        "carrier_total": {
          "sum": {
            "field": "total_count"
          }
        }
      }
    }
  }
}
My understanding from https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html is that, by default, not all documents are included in the aggregation. Indeed, depending on the query section, I see "sum_other_doc_count" in the results with values greater than zero.
My question: is there a way to construct the search so that all docs are included? The number of documents is fairly small, typically under 1k.
Thanks in advance, Reuven
Upvotes: 5
Views: 8469
Reputation: 16172
According to the documentation, size defaults to 10, and from + size cannot be more than the index.max_result_window index setting, which defaults to 10,000.
In your case the number of documents is fairly small, under 1k, so 1k results can easily be retrieved.
The size parameter can be set to define how many term buckets should be returned out of the overall terms list. By default, the node coordinating the search process will request each shard to provide its own top size term buckets and once all shards respond, it will reduce the results to the final list that will then be returned to the client.
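The reduce step described above can be illustrated with a toy simulation. This is only a sketch, not Elasticsearch's implementation: the function names shard_top_terms and reduce_terms are made up for illustration, each "shard" is just a list of term values, and the sketch ignores the separate shard_size setting that real clusters use to over-fetch from each shard.

```python
from collections import Counter


def shard_top_terms(shard_docs, size):
    """Return one shard's top `size` (term, doc_count) pairs.

    Ties are broken by term so the output is deterministic.
    """
    counts = Counter(shard_docs)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:size]


def reduce_terms(shards, size):
    """Merge the per-shard top lists and keep the overall top `size` buckets.

    Also returns the number of documents that did not land in a returned
    bucket, which plays the role of sum_other_doc_count in this toy model.
    """
    merged = Counter()
    for shard_docs in shards:
        for term, count in shard_top_terms(shard_docs, size):
            merged[term] += count
    ranked = sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))
    buckets = ranked[:size]
    total_docs = sum(len(shard_docs) for shard_docs in shards)
    other_doc_count = total_docs - sum(count for _, count in buckets)
    return buckets, other_doc_count


# Hypothetical data: two shards, each document contributes one date term.
shards = [
    ["2020-01-01", "2020-01-01", "2020-01-02", "2020-01-03"],
    ["2020-01-02", "2020-01-02", "2020-01-03", "2020-01-04"],
]
small_buckets, small_other = reduce_terms(shards, size=2)
large_buckets, large_other = reduce_terms(shards, size=10)
```

With size=2 some documents fall outside the returned buckets; once size is at least the number of distinct terms, every document is counted in a bucket.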
So set size to 1000 on the terms aggregation so that up to 1000 buckets on the date field are returned:
...
"by_date": {
  "terms": {
    "field": "date",
    "order": {
      "_term": "asc"
    },
    "size": 1000
  }
}
...
The higher the requested size is, the more accurate the results will be, but also the more expensive it will be to compute the final results.
To learn more about this, you can refer to the official documentation.
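As a rough sketch of how this could be used from a script, the helper below builds the question's request body with a larger terms size, and a second helper checks the response for leftover documents. The function names build_request and all_docs_bucketed and the bucket_limit parameter are hypothetical; sending the body to the cluster is left out, so only the body and the response shape are shown.

```python
def build_request(request_type, carrier, bucket_limit=1000):
    """Build the search body from the question, with a larger terms size.

    bucket_limit is a made-up parameter name; it is passed as the terms
    aggregation's "size".
    """
    return {
        "size": 0,  # no search hits needed, only the aggregation
        "query": {
            "bool": {
                "must": [
                    {"match": {"request_type": request_type}},
                    {"match": {"carrier_name.keyword": carrier}},
                ]
            }
        },
        "aggs": {
            "by_date": {
                "terms": {
                    "field": "date",
                    # Kept as in the question; newer Elasticsearch versions
                    # spell this ordering key "_key".
                    "order": {"_term": "asc"},
                    "size": bucket_limit,
                },
                "aggs": {"carrier_total": {"sum": {"field": "total_count"}}},
            }
        },
    }


def all_docs_bucketed(response, agg_name="by_date"):
    """True if no documents were left outside the returned buckets."""
    return response["aggregations"][agg_name]["sum_other_doc_count"] == 0


body = build_request("some_type", "some_carrier")
```

If all_docs_bucketed returns False for a real response, the bucket limit is still smaller than the number of distinct dates matched by the query.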
Upvotes: 9
Reputation: 16895
Increase the size of the terms agg from the default 10 to a large-ish number:
...
"by_date": {
  "terms": {
    "field": "date",
    "order": {
      "_term": "asc"
    },
    "size": 1000 <-----
  }
}
...
Upvotes: 0