Sam

Reputation: 6327

Paginating very large datasets

I have a dataset in MySQL for which even a query with LIMIT is already expensive, and counting the total number of results is expensive as well. I'd therefore like to avoid running a second query just to get the count. I cannot use SQL_CALC_FOUND_ROWS because the LIMIT is inside a subquery:

SELECT * FROM items,
(
  SELECT
    item_id
  FROM
    (etc)
  WHERE
    some.field=<parameter>
  AND (etc)
  GROUP BY (something)
  ORDER BY (something_else) DESC
  LIMIT 15
) subset
WHERE items.id=subset.item_id

I could LEFT JOIN items and do away with the subquery, and would then be able to use SQL_CALC_FOUND_ROWS, but that version is very, very slow. I've tried all the index optimizations, so let's just assume it is out of the question.
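
Roughly, the flattened version I have in mind looks like the sketch below; the (etc), (something), and <parameter> placeholders stand in for the real tables and conditions, so treat it as an illustration rather than the exact query:

SELECT SQL_CALC_FOUND_ROWS items.*
FROM items
LEFT JOIN (etc) ON (etc)
WHERE some.field=<parameter>
  AND (etc)
GROUP BY (something)
ORDER BY (something_else) DESC
LIMIT 15;

-- Total number of matching rows for the query above, ignoring the LIMIT
SELECT FOUND_ROWS();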

This now becomes more of a design question: how do I let the user page through this data when I don't know where the last page is? I only know when they've gone too far (e.g., the query returns no results).

Upvotes: 3

Views: 4887

Answers (2)

dkretz

Reputation: 37645

To reinforce that, the concept of "how many" is extremely transitory anyway; by the time the answer gets to the user, it could easily be wrong.

Upvotes: 0

Bill Karwin

Reputation: 562250

Here's a summary of an article by MySQL guru Baron Schwartz:

http://www.mysqlperformanceblog.com/2008/09/24/four-ways-to-optimize-paginated-displays/

Four Ways to Optimize Paginated Displays

  1. On the first query, fetch and cache all the results.

  2. Don't show all results. Not even Google lets you see the millionth result.

  3. Don't show the total count or the intermediate links to other pages. Show only the "next" link (see the sketch after this list).

  4. Estimate how many results there are. Again, Google does this and nobody complains.
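
For approach 3, one common trick (my own sketch against the question's query shape, not something spelled out in the article summary) is to fetch one row more than the page size; if the extra row comes back, the application knows a "next" link is needed, without ever counting the full result set. The LIMIT/OFFSET values below assume a page size of 15 and, as an example, page 3:

SELECT items.*
FROM items
JOIN (
  SELECT item_id
  FROM (etc)
  WHERE some.field=<parameter>
    AND (etc)
  GROUP BY (something)
  ORDER BY (something_else) DESC
  LIMIT 16   -- page size 15 plus one sentinel row
  OFFSET 30  -- page_number * page_size, supplied by the application
) subset ON items.id=subset.item_id;

-- If 16 rows come back, render 15 of them and show the "next" link;
-- if 15 or fewer, this is the last page.

For approach 4, if an approximate total is acceptable, the row estimate that EXPLAIN reports for the query without the LIMIT is one possible source.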

Upvotes: 6
