Jogi Michał

Reputation: 73

ElasticSearch sliced scroll limit (python)

I'm working with a large Elasticsearch index (5 million documents) and need to fetch data using sliced scroll in Python. Is there a way to limit the number of documents a sliced scroll returns, i.e. to set a size param? I tried setting it with [search obj].param(size=500000) and with [:500000], but neither seems to work: the sliced scroll still gives me all documents.

In my script, I use sliced scroll with Python multiprocessing, as shown here: https://github.com/elastic/elasticsearch-dsl-py/issues/817

Is there a way to get, for example, only 500000 documents using sliced scroll?

Thanks in advance.

Upvotes: 0

Views: 1849

Answers (1)

Jogi Michał

Reputation: 73

Answer from GitHub:

"There is no limit on scroll, it always returns all documents. To only get a subset simply stop consuming the iterator after you get the number you wanted to retrieve by using a break statement or similar."

https://github.com/elastic/elasticsearch-dsl-py/issues/817
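
A minimal sketch of what "stop consuming the iterator" could look like, assuming each worker handles one slice and the overall budget is split across slices. The helpers per_slice_limits, take and fake_scan are hypothetical names made up for this illustration; fake_scan only stands in for the iterator that a real sliced search would return (in elasticsearch-dsl, something along the lines of s.extra(slice={"id": slice_id, "max": n_slices}).scan()), so the example runs without a cluster.

```python
from itertools import islice


def per_slice_limits(total, n_slices):
    """Split a total document budget as evenly as possible across slices."""
    base, extra = divmod(total, n_slices)
    return [base + (1 if i < extra else 0) for i in range(n_slices)]


def take(hits, limit):
    """Consume at most `limit` items from an iterator, then stop."""
    return list(islice(hits, limit))


def fake_scan(slice_id, n_docs=1000):
    """Hypothetical stand-in for the iterator of one slice's scroll."""
    for i in range(n_docs):
        yield {"slice": slice_id, "doc": i}


if __name__ == "__main__":
    n_slices = 4
    limits = per_slice_limits(10, n_slices)
    docs = []
    # In the multiprocessing setup, each loop iteration would be one worker.
    for slice_id, limit in enumerate(limits):
        docs.extend(take(fake_scan(slice_id), limit))
    print(len(docs))
```

islice is equivalent to counting hits in a for loop and calling break once the limit is reached. Note that a slice may hold fewer documents than its share, so the combined result can come out smaller than the requested total.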

Upvotes: 1
