Reputation: 8674
For a large data set, offset
based pagination becomes slow, and so a much quicker way is to use cursor based pagination. Basically, an anchor point, where the database knows to lookup results from that point onward. With that in mind, here is the issue I am facing:
I have a tabletv_watchers
with an autoincrementing id
, mins_watching_tv
, and user_id
(20 rows total fiddle below). In this example user_id
will be the same 1
, so no need to worry about it. We want to sort by the number of minutes spent watching tv, from highest to lowest.
This is easily done using this query:
SELECT * FROM tv_watchers
ORDER BY mins_watching_tv DESC, id ASC
This will return the correct order of the 20 fields ordered as we want it in this fashion by id:
2, 17, 1, 16, 15, 5, 6, 7, 8, 9, 10, 11, 12, 13, 20, 3, 4, 14, 19, 18
The issue is we want to split it up into chunks (we call them batches) of 5 since we want to return 5 results in the order above. We do this by retrieving the first 6 results, returning the first 5 to the user, and using the 6th if it exists as the cursor (anchor point) to get the next batch from: This returns the first batch correctly:
-- (Batch 1) 2, 17, 1, 16, 15, 5
SELECT * FROM tv_watchers
ORDER BY mins_watching_tv DESC, id ASC
LIMIT 6
The 6th item here is id 5
which has a mins_watching_tv
of 60
, so since this is the cursor we use it to get the next 6 like this:
-- (Batch 2) 5, 6, 7, 8, 9, 10
SELECT * FROM tv_watchers
WHERE mins_watching_tv <= 60 OR id=5
ORDER BY mins_watching_tv DESC, id ASC
LIMIT 6
The 6th item here is id 10
which also has a mins_watching_tv
of 60
, so since this is the cursor we use it to get the next 6 like this:
-- (Batch 3 should be) 10, 11, 12, 13, 20, 3
-- (Batch 3 returns incorrectly) 5, 6, 7, 8, 9, 10
SELECT * FROM tv_watchers
WHERE mins_watching_tv <= 60 OR id=10
ORDER BY mins_watching_tv DESC, id ASC
LIMIT 6
But the issue is the results that come back are not correct, it returns the incorrect batch 3 ids seen in the comment above. I am sure it has to do with the WHERE
portion, it seems to pick up the mins_watching_tv <= 60
part but the id=10
part is there to let the database know to pick up results from that anchor point of 60 minutes and id 10, but that does not work correctly.
The final batch results should look like this:
-- (Batch 4) 3, 4, 14, 19, 18
I set up an sql fiddle here to show the issue. How can we fix the query so that it respects the cursor combination of mins_watching_tv
in combination with the id
to return the correct results in batches?
Upvotes: 5
Views: 2893
Reputation: 37472
Select your first 6 like you already did, without anything in the WHERE
.
SELECT *
FROM tv_watchers
ORDER BY mins_watching_tv DESC,
id ASC
LIMIT 6;
Duration @duration
and the ID @id
of the last row of the result from the previous step and put them into the WHERE
like
SELECT *
FROM tv_watchers
WHERE mins_watching_tv < @duration
OR mins_watching_tv = @duration
AND id >= @id
ORDER BY mins_watching_tv DESC,
id ASC
LIMIT 6;
Repeat 2. until end is reached.
Explanation:
mins_watching_tv < @duration
we can be certain, that the respective row isn't in our previous result as mins_watching_tv
is less then the minimum @duration
from our previous result and we did a ORDER BY mins_watching_tv DESC
.mins_watching_tv = @duration
we don't know yet if we already had the row. But as we additionally did an ORDER BY id ASC
, we know that all rows we already had with the same mins_watching_tv
have an id less than or equal the current maximum @id
(per mins_watching_tv
). So we only want those rows where id > @id
or, as we also want the last row of the previous result repeated, id = @id
. In short that's id >= @id
.As we want the union of both of those sets, we have to disjunct the above predicates, so use an OR
. We get (parenthesis only for clarity, they're not needed):
(mins_watching_tv < @duration)
OR (mins_watching_tv = @duration
AND id >= @id)
And here is the fiddle.
Upvotes: 4
Reputation: 15941
I only skimmed but I think you just need to tweak your conditions to (for example)
mins_watching_tv < 60 OR (mins_watching_tv = 60 AND id>=5)
Upvotes: 2