Thành Đạt

Reputation: 327

Divide elasticsearch query result into chunks?

I have an Elasticsearch query like the one below. It yields around 25k results, so how can I deliver the results in chunks, say 5000 per request, so it does not hurt the server's memory?

import requests

def get_data():
    # RM_URL is the Elasticsearch _search endpoint (defined elsewhere)
    payload = {
        "size": 50000,
        "query": {
            "filtered": {
                "filter": {
                    "bool": {
                        "must": [
                            {"term": {"events.id": "1"}},
                            {"range": {"score_content_0": {"gte": 60}}},
                            {"range": {"published_at": {"gte": "2016-12-19T00:00:00",
                                                        "lte": "2017-04-19T23:59:59"}}},
                            {"term": {"lang": "en"}}
                        ]
                    }
                }
            }
        }
    }

    r = requests.post(RM_URL, json=payload)
    results = r.json()
    totals = results['hits']['total']
    myhits = results['hits']['hits']
    return myhits

Upvotes: 0

Views: 1481

Answers (1)

Darth Kotik

Reputation: 2351

Unfortunately you can't get more than 10,000 results in one request (the default index.max_result_window), and you can't paginate past that point with from/size either. So if you really want all 25k results, you would need to use the Scroll API (the old scan search type it replaced was removed in 5.0).

Just to clarify: I'm talking about Elasticsearch 5.x (and maybe 2.4); in earlier versions it was possible.
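
A minimal sketch of that approach, assuming Elasticsearch 5.x and plain requests as in the question. SEARCH_URL, SCROLL_URL, and the generator name are illustrative, not from the question, and the question's filtered query is rewritten as the 5.x bool/filter equivalent:

import requests

# Assumed endpoints -- adjust host and index to your setup.
SEARCH_URL = "http://localhost:9200/myindex/_search"
SCROLL_URL = "http://localhost:9200/_search/scroll"

QUERY = {
    "bool": {  # "filtered" was removed in 5.0; bool + filter is the equivalent
        "filter": [
            {"term": {"events.id": "1"}},
            {"range": {"score_content_0": {"gte": 60}}},
            {"range": {"published_at": {"gte": "2016-12-19T00:00:00",
                                        "lte": "2017-04-19T23:59:59"}}},
            {"term": {"lang": "en"}}
        ]
    }
}

def get_data_in_chunks(chunk_size=5000):
    # The first request opens a scroll context that Elasticsearch
    # keeps alive for one minute between calls.
    r = requests.post(SEARCH_URL, params={"scroll": "1m"},
                      json={"size": chunk_size, "query": QUERY})
    results = r.json()
    while results["hits"]["hits"]:
        yield results["hits"]["hits"]
        # Each follow-up call returns the next chunk and renews the context.
        r = requests.post(SCROLL_URL,
                          json={"scroll": "1m",
                                "scroll_id": results["_scroll_id"]})
        results = r.json()

for chunk in get_data_in_chunks():
    print(len(chunk))  # at most 5000 hits per iteration

If you stop iterating early, a DELETE to /_search/scroll with the scroll_id frees the server-side context sooner than waiting for the timeout.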

Upvotes: 2
