Reputation: 327
I have an Elasticsearch query like the one below, which yields around 25k results. How can I divide the delivery of the results into chunks, say 5000 results at a time, so it doesn't hurt the server's memory?
import requests

def get_data():
    payload = {
        "size": 50000,
        "query": {
            "filtered": {
                "filter": {
                    "bool": {
                        "must": [
                            {"term": {"events.id": "1"}},
                            {"range": {"score_content_0": {"gte": 60}}},
                            {"range": {"published_at": {"gte": "2016-12-19T00:00:00",
                                                        "lte": "2017-04-19T23:59:59"}}},
                            {"term": {"lang": "en"}}
                        ]
                    }
                }
            }
        }
    }
    r = requests.post(RM_URL, json=payload)
    results = r.json()
    totals = results['hits']['total']  # total hit count reported by ES
    myhits = results['hits']['hits']
    return myhits
Upvotes: 0
Views: 1481
Reputation: 2351
Unfortunately you can't get more than 10,000 results in a single request, and you can't even paginate past that point with from/size. So if you really want all 25k results, you need to use the scroll API (the old scan search type was removed in 5.0).
Just to clarify: I'm talking about Elasticsearch 5.x (and maybe 2.4); in earlier versions deep pagination was possible.
Upvotes: 2