Slava Nikulin

Reputation: 505

Postgresql returns random rows when using large OFFSET

I'm curious about this behaviour:

test_db=# create table test_table(id bigint);
test_db=# insert into test_table(id) select * from generate_series(1, 1000000);
test_db=# select * from test_table offset 100000 limit 1;

   id
-------
 87169
(1 row)

test_db=# select * from test_table offset 100000 limit 1;

   id
--------
 186785
(1 row)

test_db=# select * from test_table offset 100000 limit 1;

  id
--------
 284417
(1 row)

It seems that Postgres iterates forward with some randomizing rule. Why does a large offset "mix" the table? After that, if we use a small offset, it returns a "stable" value:

test_db=# select * from test_table offset 1 limit 1;

   id
--------
 282050
(1 row)

test_db=# select * from test_table offset 1 limit 1;

   id
--------
 282050
(1 row)

Upvotes: 4

Views: 546

Answers (2)

jjanes

Reputation: 44305

PostgreSQL has a feature that tries to get multiple concurrent sequential scans on the same large table to all work on the same part of the table at the same time, so that they can share cache space and don't have to each read the same data off disk individually. A side effect of this is that for partial (like with LIMIT) sequential scans done consecutively, each one starts where the previous one left off.

The synchronization points are always at page boundaries, so with a low OFFSET and a low LIMIT you just keep reading data from the same page (and from that page's beginning) over and over again and getting the same data.

You can turn this off with set synchronize_seqscans TO off; if you need to get more stable results for some internal testing purpose. If you do this you are, as von Neumann might say, living in a state of sin.
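The effect described above can be sketched with a small simulation. This is not PostgreSQL's actual implementation (which tracks a per-relation scan-start hint at page granularity in shared memory); the page size, table size, and function name below are all made up for illustration. The point is that a shared "where the last scan stopped" hint makes consecutive large-OFFSET scans return different rows, while small OFFSET/LIMIT scans stay within one page and keep returning the same row:

```python
# Minimal sketch of synchronized sequential scans (NOT PostgreSQL's real code).
# Assumption: one shared scan-start hint per table, updated to the page where
# the previous partial scan stopped reading.

ROWS_PER_PAGE = 100
TOTAL_ROWS = 1_000_000
PAGES = TOTAL_ROWS // ROWS_PER_PAGE
start_page = 0  # shared scan-start hint, persists across queries

def seqscan_offset_limit(offset, limit):
    """Simulate SELECT * FROM t OFFSET <offset> LIMIT <limit> via seq scan."""
    global start_page
    first_row = start_page * ROWS_PER_PAGE       # scan begins at the hint, not row 0
    rows = [(first_row + i) % TOTAL_ROWS + 1     # ids are 1-based, scan wraps around
            for i in range(offset, offset + limit)]
    # The scan stopped after reading offset+limit rows; the hint moves to that
    # page boundary, so the next scan resumes from there.
    start_page = ((first_row + offset + limit) // ROWS_PER_PAGE) % PAGES
    return rows

print(seqscan_offset_limit(100000, 1))  # -> [100001]
print(seqscan_offset_limit(100000, 1))  # -> [200001], resumed where the last scan stopped
print(seqscan_offset_limit(1, 1))       # small scan stays inside one page...
print(seqscan_offset_limit(1, 1))       # ...so the hint doesn't move: same row again
```

This also shows why the question's small-offset queries looked "stable": reading two rows never crosses a page boundary, so the hint (and thus the starting point) never advances.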

Upvotes: 2

Jim Jones

Reputation: 19653

As the table records are not physically ordered, it is imperative to use ORDER BY in an OFFSET .. LIMIT query. Otherwise you might get seemingly random results:

SELECT * 
FROM test_table 
ORDER BY id 
OFFSET 100000 
LIMIT 1;

Demo: db<>fiddle

Upvotes: 3
