Jin Kwon
Jin Kwon

Reputation: 22017

Pagination getting slower while page number increasing

I have a table look like this.

parent   VARCHAR PK
priority INT     PK
child    VARCHAR

Now I need to SELECT/COUNT, for Spring pagination, those who has children more than specified number.

Here comes my queries.

-- SELECT
SELECT
  parent,
  GROUP_CONCAT(child ORDER BY priority SEPARATOR 0x1D) AS concatenated_children,
  COUNT(child) AS child_count
FROM mytable
GROUP BY parent
HAVING child_count >= ?1
ORDER BY parent ASC
LIMIT 2048,?   -- GETTING SLOWER when the OFFSET is increases
-- COUNT
SELECT COUNT(wrk.parent)
FROM (SELECT parent, COUNT(child) AS child_count
      FROM mytable
      GROUP BY parent
      HAVING child_count >= ?1) AS wrk

The ?1 part is a constant(8) and the the offset part(,?) is increases per pagination.

The problem is that the selection query is getting slower while the page number increases.

?size=2048,page=0   took 60 ms
?size=2048,page=100 took 2646 ms

How can I fix this?

CREATE TABLE `mytable` (
  `parent` varchar(100) NOT NULL,
  `child` varchar(255) NOT NULL,
  `priority` int(11) NOT NULL,
  PRIMARY KEY (`parent`,`priority`)
)
parent child           priority
---------------------------------
NIKE   Air Max         1
NIKE   Air Jordan      2
WATCH  Patek Philippe  1
WATCH  Rolex           2

Criteria

Select all parents (and its children)
which each has at least some number children, say 8.

Upvotes: 6

Views: 1857

Answers (2)

Alex Veleshko
Alex Veleshko

Reputation: 1242

The slowdown happens because of the way OFFSET works: it fetches all the data and only then drops the part before the offset. For your case it means the grouping will happen not only for the current page, but for all the previous pages too.

The standard trick to fix this kind of problem is to use Keyset Pagination. When fetching the page, you need to remember its last parent. Then in order to fetch the next page, you use your query with the

WHERE parent > YOUR_LAST_PARENT

clause and no OFFSET. The RDBMS will see that parent is indexed and quickly navigate the index to the correct parent.

Full query:

  SELECT parent,
         GROUP_CONCAT(child ORDER BY priority
                            SEPARATOR 0x1D) AS concatenated_children,
         COUNT(child) AS child_count
    FROM mytable
   WHERE parent > ?1
GROUP BY parent
  HAVING child_count >= ?2
ORDER BY parent
   LIMIT 2048

When querying for the first page, you'll need to run this query without the WHERE clause.

Your URL parameters will now look like

?size=2048,parent=PARENTXXX

Of course, you'll have to give up on Spring Data JPA built-in pagination. It's also not immediately possible to jump to a particular page, you're basically limited to the next/previous pages (as long as you remember the previous page parent).

By the way, when experimenting with this I observed that the major problem with the Spring Data JPA pagination is that it has to COUNT all the rows. For InnoDB that alone takes more time than fetching the page.

EDIT. The first version of the answer recommended adding the condition on parent to the HAVING clause. As mentioned in the comments below it won't work (I tested that it doesn't work).

Upvotes: 2

Rick James
Rick James

Reputation: 142443

As already mentioned, OFFSET must get and toss all the rows before the desired ones.

If you can code the query to "remember where you left off", there is a much faster way than using OFFSET.

Here is a MySQL-specific discussion of such: http://mysql.rjweb.org/doc.php/pagination

Upvotes: 4

Related Questions