Sam

Reputation: 6327

Paginating very large datasets

I have a dataset in MySQL for which even a query with LIMIT is already expensive, and counting the total number of results is expensive as well. I'd therefore like to avoid running a second query just to get the count. I cannot use SQL_CALC_FOUND_ROWS because the LIMIT is inside a subquery:

SELECT * FROM items,
(
  SELECT
    item_id
  FROM
    (etc)
  WHERE
    some.field=<parameter>
  AND (etc)
  GROUP BY (something)
  ORDER BY (something_else) DESC
  LIMIT 15
) subset
WHERE items.id=subset.item_id

I could LEFT JOIN items and do away with the subquery, and would then be able to use SQL_CALC_FOUND_ROWS, but that version is very, very slow. I've tried all the index optimizations, so let's just assume it is out of the question.
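
Roughly, the flattened version I have in mind looks like the sketch below; the (etc), (something), and <parameter> placeholders stand in for the real tables and conditions, so treat it as an illustration rather than the exact query:

SELECT SQL_CALC_FOUND_ROWS items.*
FROM items
LEFT JOIN (etc) ON (etc)
WHERE some.field=<parameter>
  AND (etc)
GROUP BY (something)
ORDER BY (something_else) DESC
LIMIT 15;

-- Total number of matching rows for the query above, ignoring the LIMIT
SELECT FOUND_ROWS();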

This now becomes more of a design question: how do I let the user page through this data when I don't know where the last page is? I only know when they've gone too far (e.g., the query returns no results).

Upvotes: 3

Views: 4887

Answers (2)

dkretz

Reputation: 37645

To reinforce that, the concept of "how many" is extremely transitory anyway; by the time the answer gets to the user, it could easily be wrong.

Upvotes: 0

Bill Karwin

Reputation: 562250

Here's a summary of an article by MySQL guru Baron Schwartz:

http://www.mysqlperformanceblog.com/2008/09/24/four-ways-to-optimize-paginated-displays/

Four Ways to Optimize Paginated Displays

  1. On the first query, fetch and cache all the results.

  2. Don't show all results. Not even Google lets you see the millionth result.

  3. Don't show the total count or the intermediate links to other pages. Show only the "next" link (see the sketch after this list).

  4. Estimate how many results there are. Again, Google does this and nobody complains.
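
For approach 3, one common trick (my own sketch against the question's query shape, not something spelled out in the article summary) is to fetch one row more than the page size; if the extra row comes back, the application knows a "next" link is needed, without ever counting the full result set. The LIMIT/OFFSET values below assume a page size of 15 and, as an example, page 3:

SELECT items.*
FROM items
JOIN (
  SELECT item_id
  FROM (etc)
  WHERE some.field=<parameter>
    AND (etc)
  GROUP BY (something)
  ORDER BY (something_else) DESC
  LIMIT 16   -- page size 15 plus one sentinel row
  OFFSET 30  -- page_number * page_size, supplied by the application
) subset ON items.id=subset.item_id;

-- If 16 rows come back, render 15 of them and show the "next" link;
-- if 15 or fewer, this is the last page.

For approach 4, if an approximate total is acceptable, the row estimate that EXPLAIN reports for the query without the LIMIT is one possible source.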

Upvotes: 6
