Reputation: 799
I am moving from Oracle to Postgresql. I am trying to convert some Oracle hierarchical queries to Postgres. For example, in Oracle to return a comma-delimited ordered list of all ids under (i.e., the children) and including the id_to_start_with
I would do the following:
SELECT LISTAGG(id_something, ',') WITHIN GROUP (ORDER BY id_something) AS somethings FROM(
SELECT DISTINCT D.id_something
FROM something_table D
START WITH D.id_something = :id_to_start_with
CONNECT BY D.id_something_parent = PRIOR D.id_something
)
The equivalent in Postgres would seem to be:
WITH RECURSIVE the_somethings(id_something) AS (
SELECT id_something
FROM something_table
WHERE id_something = $id_to_start_with
UNION ALL
SELECT D.id_something
FROM the_somethings DR
JOIN something_table D ON DR.id_something = D.id_something_parent
)
SELECT string_agg(temp_somethings.id_something::TEXT, ',') AS somethings
FROM (
SELECT id_something
FROM the_somethings
ORDER BY id_something
) AS temp_somethings
Likewise if I want to return a comma-delimited ordered list of all ids above (i.e., the parents) and including the id_to_start_with
I would do the following in Oracle:
SELECT LISTAGG(id_something, ',') WITHIN GROUP (ORDER BY id_something) AS somethings FROM(
SELECT DISTINCT D.id_something
FROM something_table D
START WITH D.id_something = :id_to_start_with
CONNECT BY D.id_something = PRIOR D.id_something_parent
)
The equivalent in Postgres would seem to be:
WITH RECURSIVE the_somethings(id_something, path) AS (
SELECT id_something
, id_something::TEXT as path
FROM something_table
WHERE id_something_parent IS NULL
UNION ALL
SELECT D.id_something
, (DR.path || ',' || D.id_something::TEXT) as path
FROM something_table D
JOIN the_somethings DR ON DR.id_something = D.id_something_parent
)
SELECT path
FROM the_somethings
WHERE id_something = $id_to_start_with
ORDER BY id_something
My question has to do with the last Postgres query. It seems terribly inefficient to me and I wonder if there is a better way to write it. That is, in Oracle the query will look for the parent of the id_to_start_with
, then the parent of the parent, and so forth to the root.
The Postgres query, on the other hand, gets every single root to child path combination possible and then throws everything away except for the one root to id_to_start_with
that I am looking for. That is potentially a ton of data to create just to throw it all away except for the one row I am looking for.
Is there a way to get a comma-delimited ordered list of all the parents of a particular id_to_start_with
that is as performant in Postgres as it is in Oracle?
Edit: Adding explain plans from Oracle and Postgres.
Postgres Explain Analyyze Output
CTE Scan on the_somethings (cost=62.27..74.66 rows=3 width=76) (actual time=0.361..0.572 rows=1 loops=1)
Filter: (id_something = 1047)
Rows Removed by Filter: 82
CTE the_somethings
-> Recursive Union (cost=0.00..62.27 rows=551 width=76) (actual time=0.026..0.433 rows=83 loops=1)
-> Seq Scan on something_table (cost=0.00..2.83 rows=1 width=8) (actual time=0.023..0.034 rows=1 loops=1)
Filter: (id_something_parent IS NULL)
Rows Removed by Filter: 82
-> Hash Join (cost=0.33..4.84 rows=55 width=76) (actual time=0.028..0.065 rows=16 loops=5)
Hash Cond: (d.id_something_parent = dr.id_something)
-> Seq Scan on something_table d (cost=0.00..2.83 rows=83 width=16) (actual time=0.002..0.012 rows=83 loops=5)
-> Hash (cost=0.20..0.20 rows=10 width=76) (actual time=0.009..0.009 rows=17 loops=5)
Buckets: 1024 Batches: 1 Memory Usage: 10kB
-> WorkTable Scan on the_somethings dr (cost=0.00..0.20 rows=10 width=76) (actual time=0.001..0.004 rows=17 loops=5)
Planning time: 0.407 ms
Execution time: 0.652 ms
This is the final query based on Jakub's answer below.
WITH RECURSIVE the_somethings(id_something, path, level, orig_id, id_something_parent) AS (
SELECT id_something
, id_something::TEXT as path
, 0 as level
, id_something AS orig_id
, id_something_parent
FROM something_table
WHERE id_something IN (1047, 448)
UNION ALL
SELECT D.id_something
, (D.id_something::TEXT || ',' || DR.path) as path
, DR.level + 1 as level
, DR.orig_id as orig_id
, D.id_something_parent
FROM something_table D
JOIN the_somethings DR ON D.id_something = DR.id_something_parent
)
SELECT DISTINCT ON(orig_id) orig_id, path
FROM the_somethings
ORDER BY orig_id, level DESC
;
Upvotes: 4
Views: 1560
Reputation: 16487
CTEs in PostgreSQL are fenced meaning they will be materialized and only then will the filter from outer query will be applied. To make the query perform correctly build it the other way around and put the filter inside the CTE.
WITH RECURSIVE the_somethings(id_something, path) AS (
SELECT id_something
, id_something::TEXT as path, 0 as level, id_something AS orig_id
FROM something_table
WHERE id_something IN ($id_to_start_with,$id_to_start_with2)
UNION ALL
SELECT D.id_something
, (D.id_something::TEXT || ',' || DR.path) as path, DR.level + 1, DR.orig_id
FROM something_table D
JOIN the_somethings DR ON DR.id_something_parent = D.id_something
)
SELECT DISTINCT ON(orig_id) orig_id, path
FROM the_somethings
ORDER BY orig_id, DR.level DESC
Upvotes: 1