Reputation: 11
I need to paginate more than 30M users which are hosted on MySQL. I'm displaying 15 users per page, but it's quite slow. My goal is to access to any random page and load it in a few ms.
At the beginning, I was using the offset method for MySQL, but as I said, is quite slow (and a bad idea for sure). Then I moved to ElasticSearch, but you still have some window limit, so you are limited. After that, I have been checking different ways like the "cursor" method, but I can not access to any random page. For example, we start at the first page, and we have 100000 pages, I would like to access the 4782th page, and load it in a few ms. With the cursor method, I'm just able to access the next && prev page, and the "scroll" method doesn't fit what I really need.
My users' ID are not sorted just by ID, so I can not use it as a delimiter. Already thought about Late row lookups
I don't mind to move all my data to a new DB (but would be to find different solutions). Here Amazon does it really well (https://www.amazon.com/review/top-reviewers)
Query using offset:
SELECT users.* from users
WHERE users.country = 'DE'
ORDER BY users.posts_count DESC, users.id DESC
LIMIT 15 OFFSET 473
PD: My user list is almost in real-time, so it's changing every hour.
Any ideas? Thanks a lot!
Upvotes: 1
Views: 1262
Reputation: 142298
"access the 4782th page" -- What is the use case for this? "Pagination" is useful for a few pages, maybe a few dozen pages, but not thousands.
[Next], [Prev], [First], [Last] are useful. But if you want a random probe, then call it a [Random] probe, not "page 4782".
OFFSET
is inefficient. Here is a discussion of an alternative: http://mysql.rjweb.org/doc.php/pagination
Meanwhile add INDEX(country, posts_count, id)
Upvotes: 1
Reputation: 217304
One way to achieve this with Elasticsearch is to add a linearly increasing field (e.g. sort_field
) to each of your records (or use your ID field if it's linearly increasing). The first record's field has value 1, the second 2, the third 3, etc...
Then, if you sort by that field in ascending mode, you can use the search_after
feature in order to access any record directly.
For instance, if you need to access the 4782th page (i.e. record 71730 and following), you can achieve it like this:
POST my-index/_search
{
"size": 15, <--- the page size
"sort": [
{
"sort_field": "asc" <--- properly ordering the records
}
],
"search_after": [ 71730 ] <--- direct access to the desired record/page
}
Under certain circumstances, it is also possible to make the sorting even faster by leveraging the index sorting capability.
Note: deep pagination is not something Elasticsearch has been built for. The solution above works, but can have some shortcomings (see comments) depending on your context. It might not be the best available technology for what you need to do.
Upvotes: 0