Tom
Tom

Reputation: 7992

How should I paginate a SQL query where the rows have no natural ordering?

I've inherited a code base with an associated database schema.

The schema has numeric, auto-incrementing primary keys. It does pagination with queries like this:

WITH params AS 
(
    SELECT id 
    FROM table AS first_id 
    WHERE uuid = "{last_uuid}"
)
SELECT *
FROM table
WHERE id > params.first_id
ORDER BY id
LIMIT 100

or words to that general effect.

I've added some new tables that use the UUID as their primary key rather than having separate "public" UUID keys and internal integer IDs. But now I'm wondering how to paginate them.

I could do it like this:

SELECT *
FROM table
ORDER BY uuid
    OFFSET n ROWS
    FETCH NEXT 100 ROWS ONLY

Someone has objected that this has performance implications. My brief investigation suggests that this is true; the time for the first method is constant (about 7 ms in my example) while the time for the second method appears to be O(n) in the number of rows offset - an offset of 1.45 million rows takes about 200 ms. I suspect that the rows are all still cached in RAM at that point. As the table gets bigger, performance starts to decline fairly rapidly; with 1.85 million rows, the query takes nearly 600 ms.

Now, at some level, I'm not sure I care about that kind of performance. The chances that I'll have to paginate 1+ million rows of data in my current application are slim.

But is there a good way to paginate data like this that doesn't have a natural ordering?

Database is MariaDB if that makes a big difference, though I'd generally be interested in answers for most major RDBMSs.

Upvotes: 1

Views: 54

Answers (2)

Rick James
Rick James

Reputation: 142473

  • The PRIMARY KEY is the most efficient ordering for SELECT * in Engine=InnoDB. This statement is valid regardless of whether the PK is a UUID or auto_inc or something else.

  • "Remembering where you left off" is the most efficient way to get to the 'next' page. See Pagination

  • OFFSET can lead to hiccups with duplicated/missing rows when moving from one page to the next.

  • UUID keys are inherently inefficient because they scatter the data around the table. See UUIDs

  • You should be using InnoDB.

  • Using no index will be terrible for your task.

  • WHERE uuid > some_uuid ORDER BY uuid begs for INDEX(uuid) or, even better, PRIMARY KEY(uuid).

  • A binary UUID (see link) will save 36-16 bytes per table and per index containing the UUID. (That may not matter for only a few million rows.)

  • Regardless, you must have an ordering to the rows.

Upvotes: 1

nik0x1
nik0x1

Reputation: 1461

Try to abandon offsets and switch to cursor base pagination.

Conceptually:

First page.

SELECT *
FROM table
ORDER BY uuid
LIMIT 100

Next page.

SELECT *
FROM table
WHERE uuid > some_uuid
ORDER BY uuid
LIMIT 100

Previous page.

SELECT *
FROM table
WHERE uuid < some_uuid
ORDER BY uuid DESC
LIMIT 100

P.S. Also check that you have an index on the uuid field and that you are using InnoDB engine.

Upvotes: 0

Related Questions