user8870331
user8870331

Reputation:

Does PostgreSQL share the ordering of a CTE?

In PostgreSQL, common table expressions (CTE) are optimization fences. This means that the CTE is materialized into memory and that predicates from another query will never be pushed down into the CTE.

Now I am wondering if other metadata about the CTE, such as ordering, is shared to the other queries. Let's take the following query:

WITH ordered_objects AS
(
    SELECT * FROM object ORDER BY type ASC LIMIT 10
)
SELECT MIN(type) FROM ordered_objects

Here, MIN(type) is obviously always the first row of ordered_objects (or NULL if ordered_objects is empty), because ordered_objects is already ordered by type. Is this knowledge about ordered_objects available when evaluating SELECT MIN(type) FROM ordered_objects?

Upvotes: 11

Views: 3305

Answers (2)

joop
joop

Reputation: 4503

  • [in Postgres] a CTE is always excuted once
  • , even if it referenced more than once
  • its result is stored into a temporary table (materialized)
  • the outer query has no knowledge about the internal structure (indexes are not available) or ordering (not sure about frequency estimates), it just scans the temp results
  • in the below fragment the CTE is scanned twice, even if the results are known to be identical.

\d react

EXPLAIN ANALYZE
WITH omg AS (
        SELECT topic_id
        , row_number() OVER (PARTITION by krant_id ORDER BY topic_id) AS rn
        FROM react
        WHERE krant_id = 1
        AND topic_id < 5000000
        ORDER BY topic_id ASC
        )
SELECT MIN (o2.topic_id)
FROM omg o1                   --
JOIN omg o2 ON o1.rn = o2.rn  -- exactly the same
WHERE o1.rn = 1
        ;

                    Table "public.react"
   Column   |           Type           |     Modifiers      
------------+--------------------------+--------------------
 krant_id   | integer                  | not null default 1
 topic_id   | integer                  | not null
 react_id   | integer                  | not null
 react_date | timestamp with time zone | 
 react_nick | character varying(1000)  | 
 react_body | character varying(4000)  | 
 zoek       | tsvector                 | 
Indexes:
    "react_pkey" PRIMARY KEY, btree (krant_id, topic_id, react_id)
    "react_krant_id_react_nick_react_date_topic_id_react_id_idx" UNIQUE, btree (krant_id, react_nick, react_date, topic_id, react_id)
    "react_date" btree (krant_id, topic_id, react_date)
    "react_nick" btree (krant_id, topic_id, react_nick)
    "react_zoek" gin (zoek)
Triggers:
    tr_upd_zzoek_i BEFORE INSERT ON react FOR EACH ROW EXECUTE PROCEDURE tf_upd_zzoek()
    tr_upd_zzoek_u BEFORE UPDATE ON react FOR EACH ROW WHEN (new.react_body::text <> old.react_body::text) EXECUTE PROCEDURE tf_upd_zzoek()

----------

 Aggregate  (cost=232824.29..232824.29 rows=1 width=4) (actual time=1773.643..1773.645 rows=1 loops=1)
   CTE omg
     ->  WindowAgg  (cost=0.43..123557.17 rows=402521 width=8) (actual time=0.217..1246.577 rows=230822 loops=1)
           ->  Index Only Scan using react_pkey on react  (cost=0.43..117519.35 rows=402521 width=8) (actual time=0.161..419.916 rows=230822 loops=1)
                 Index Cond: ((krant_id = 1) AND (topic_id < 5000000))
                 Heap Fetches: 442
   ->  Nested Loop  (cost=0.00..99136.69 rows=4052169 width=4) (actual time=0.264..1773.624 rows=1 loops=1)
         ->  CTE Scan on omg o1  (cost=0.00..9056.72 rows=2013 width=8) (actual time=0.249..59.252 rows=1 loops=1)
               Filter: (rn = 1)
               Rows Removed by Filter: 230821
         ->  CTE Scan on omg o2  (cost=0.00..9056.72 rows=2013 width=12) (actual time=0.003..1714.355 rows=1 loops=1)
               Filter: (rn = 1)
               Rows Removed by Filter: 230821
 Total runtime: 1782.887 ms
(14 rows)

Upvotes: 3

Vao Tsun
Vao Tsun

Reputation: 51456

If I understand your question correctly - no, it does not. no such knowledge does. As you will find in example below. when you limit to 10 rows, execution is extremely faster - less data to process (in my case million times less), which would mean CTE scans whole ordered set ignoring the fact, that min would be in first rows...

data:

t=# create table object (type bigint);
CREATE TABLE
Time: 4.636 ms
t=# insert into object select generate_series(1,9999999);
INSERT 0 9999999
Time: 7769.275 ms

with limit:

explain analyze WITH ordered_objects AS
(
    SELECT * FROM object ORDER BY type ASC LIMIT 10
)
SELECT MIN(type) FROM ordered_objects;

Execution time: 3150.183 ms

https://explain.depesz.com/s/5yXe

without:

explain analyze WITH ordered_objects AS
(
    SELECT * FROM object ORDER BY type ASC
)
SELECT MIN(type) FROM ordered_objects;

Execution time: 16032.989 ms

https://explain.depesz.com/s/1SU

I surely warmed up data before tests

Upvotes: 5

Related Questions