Reputation: 2388
I have to query records from Elasticsearch and display them in a grid with a page size of 1000. My index can contain more than 1 million records.
I am no longer able to do paging with from + size queries because of the 10,000 limit on index.max_result_window. I don't want to increase this limit due to performance reasons.
I am using the scroll API for paging and looping through the records to reach the desired page, e.g. if someone requests the 9th page I scroll 9 times.
It works well for the initial pages, but as you can imagine, going to the last page is very slow.
I am not an expert in Elasticsearch, so are there any suggestions to improve this?
Thanks in advance
Edit 21/09/2016
Version: Elasticsearch 2.4.0
Scroll example: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-scroll.html
Scroll Size: 1000
So if I have 1,000,000 records, to get to the last page I have to scroll 1000 times.
Plus the user has the option to go back as well, which is not supported by scroll, so I have to start scrolling again from the beginning.
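To make the cost concrete, here is a rough sketch of the arithmetic, assuming the initial search request returns page 1 and each further scroll request returns the next page. The function names are made up for illustration and are not part of any Elasticsearch client.

```python
def total_pages(total_records: int, page_size: int) -> int:
    """Number of pages needed to show every record (integer ceiling division)."""
    return (total_records + page_size - 1) // page_size


def scroll_requests_needed(current_page: int, target_page: int) -> int:
    """Requests needed to reach target_page with a forward-only scroll.

    current_page == 0 means no scroll has been started yet.
    """
    if target_page == current_page:
        return 0
    if target_page > current_page:
        # Moving forward: one request per page skipped or shown.
        return target_page - current_page
    # Moving backward: scroll cannot rewind, so start over from page 1.
    return target_page


if __name__ == "__main__":
    print(total_pages(1_000_000, 1000))   # pages in the whole result set
    print(scroll_requests_needed(0, 9))   # first jump to page 9
    print(scroll_requests_needed(9, 8))   # one page back means a restart
```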
I think there should be a better way of doing this.
Upvotes: 2
Views: 3801
Reputation: 534
I believe that a scroll query without the sort set to "_doc" behaves much like increasing the max result window size: because you are still returning results in order of score, you are still paying the cost of deep paging.
If you don't care about the order of the results, set the sort to "_doc". See this. This would still not let you go back, though, because that's just not how scroll works.
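As a sketch of what that looks like, the snippet below only builds the request paths and JSON bodies for a "_doc"-sorted scroll; it does not talk to a cluster. The index name `my_index`, the `match_all` query, and the `1m` keep-alive are placeholder assumptions, and the helper names are hypothetical.

```python
import json


def scroll_search_request(index: str, page_size: int = 1000, keep_alive: str = "1m"):
    """Path and body for the initial scroll search, sorted by _doc (index order)."""
    path = f"/{index}/_search?scroll={keep_alive}"
    body = {
        "size": page_size,
        "query": {"match_all": {}},  # placeholder query, replace with your own
        "sort": ["_doc"],            # skip scoring/sorting work between pages
    }
    return path, body


def next_scroll_request(scroll_id: str, keep_alive: str = "1m"):
    """Path and body for fetching the next batch of an open scroll."""
    return "/_search/scroll", {"scroll": keep_alive, "scroll_id": scroll_id}


if __name__ == "__main__":
    path, body = scroll_search_request("my_index")  # "my_index" is hypothetical
    print("POST", path)
    print(json.dumps(body, indent=2))
```

Each response carries a scroll ID that you pass to the next request, which is why you can only move forward.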
If you do want the documents in order of their score and also want to switch pages at any time, there is no other way than to increase the max result window size.
Increasing the max result window doesn't actually affect performance until you actually start deep paging, and there isn't really any way to avoid deep paging if you want pagination based on the score.
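If you go that route, a minimal sketch of the pieces involved could look like this: the settings body you would send to the index's `_settings` endpoint, and a from + size body for a given page. The helper names and the `match_all` query are assumptions for illustration; the check mirrors the rule that from + size must not exceed `index.max_result_window`.

```python
def window_settings(max_result_window: int) -> dict:
    """Body for PUT /<index>/_settings to raise the result window."""
    return {"index": {"max_result_window": max_result_window}}


def page_request(page: int, page_size: int, max_result_window: int = 10000) -> dict:
    """Search body for a 1-based page using from + size."""
    offset = (page - 1) * page_size
    if offset + page_size > max_result_window:
        # Elasticsearch would reject this request; fail early instead.
        raise ValueError(
            f"from + size = {offset + page_size} exceeds "
            f"max_result_window = {max_result_window}"
        )
    return {
        "from": offset,
        "size": page_size,
        "query": {"match_all": {}},  # placeholder query, replace with your own
    }


if __name__ == "__main__":
    print(window_settings(1_000_000))
    print(page_request(1000, 1000, max_result_window=1_000_000))
```

With the default window of 10,000 and a page size of 1000, page 10 is the last page this accepts.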
The only other thing you could do is have your application request all the pages asynchronously, store the results, and then serve them from that store when the user actually requests them.
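A rough sketch of that idea, using a thread pool and an in-memory store: `fetch_page` is a hypothetical callable you would supply (for example, something that walks your scroll or runs a from + size query), and `PagePrefetcher` is an illustrative name, not an existing library class. Keeping a million records in memory may not be practical, so treat the store as a stand-in for whatever cache you actually use.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List


class PagePrefetcher:
    """Fetches pages in the background and keeps the results for later reads."""

    def __init__(self, fetch_page: Callable[[int], List[dict]], max_workers: int = 4):
        self._fetch_page = fetch_page  # user-supplied function: page number -> records
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._futures: Dict[int, Future] = {}

    def prefetch(self, pages: Iterable[int]) -> None:
        """Schedule background fetches for pages not requested before."""
        for page in pages:
            if page not in self._futures:
                self._futures[page] = self._pool.submit(self._fetch_page, page)

    def get(self, page: int) -> List[dict]:
        """Return a page, waiting for its fetch if it has not finished yet."""
        self.prefetch([page])
        return self._futures[page].result()

    def close(self) -> None:
        """Wait for outstanding fetches and release the worker threads."""
        self._pool.shutdown(wait=True)
```

The user can then jump to any stored page, forward or backward, without touching the scroll again.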
Upvotes: 1