Reputation: 222
I'm looking for feedback and advice for using CONNECT BY in a hierarchical query.
Suppose I have a table that looks like this:
id | alt_id | parent_id | parent_ext_id |
---|---|---|---|
A | A1 | null | null |
B | B1 | A | null |
C | C1 | null | A1L |
D | null | A | null |
E | E1 | null | B1L |
F | null | B | null |
The "id" column is the PK and never null. "alt_id" is an "alternate id", unique and can be null. Sometimes a record refers to its parent "id" via "parent_id" and sometimes to its parent "alt_id" indirectly through a lookup table via "parent_ext_id". Never both.
Lookup table:
ext_id | alt_id |
---|---|
A1L | A1 |
B1L | B1 |
Suppose I'm given ext_id 'A1L' and I want its descendents. E.g.
depth | descendent_id | parent |
---|---|---|
1 | B | A |
1 | C | A |
1 | D | A |
2 | E | B |
2 | F | B |
Here's an example query, which works, but I'm wondering if the CONNECT BY clause is the best it can be because I've had some performance issues with it when I try it with real data...
WITH hierarchy_table AS (
SELECT 'A' id, 'A1' alt_id, null parent_id, null parent_ext_id FROM dual UNION
SELECT 'B' id, 'B1' alt_id, 'A' parent_id, null parent_ext_id FROM dual UNION
SELECT 'C' id, 'C1' alt_id, null parent_id, 'A1L' parent_ext_id FROM dual UNION
SELECT 'D' id, null alt_id, 'A' parent_id, null parent_ext_id FROM dual UNION
SELECT 'E' id, 'E1' alt_id, null parent_id, 'B1L' parent_ext_id FROM dual UNION
SELECT 'F' id, null alt_id, 'B' parent_id, null parent_ext_id FROM dual
),
lookup_table AS (
SELECT 'A1L' ext_id, 'A1' alt_id FROM dual UNION
SELECT 'B1L' ext_id, 'B1' alt_id FROM dual
),
root_node AS (
SELECT * FROM hierarchy_table
WHERE alt_id = (SELECT alt_id FROM lookup_table WHERE ext_id='A1L')
)
SELECT LEVEL AS depth, id,
CASE
WHEN parent_id IS NOT NULL THEN parent_id
ELSE (
SELECT id FROM hierarchy_table WHERE alt_id = (
SELECT alt_id FROM lookup_table WHERE ext_id=ht.parent_ext_id)
)
END AS parent
FROM hierarchy_table ht
-- Start with the children
START WITH parent_ext_id = 'A1L' OR parent_id = (SELECT id FROM root_node)
CONNECT BY NOCYCLE (
(parent_id IS NOT NULL AND PRIOR id = parent_id)
OR
(parent_ext_id IS NOT NULL AND PRIOR alt_id = (
SELECT alt_id FROM lookup_table WHERE ext_id=parent_ext_id))
) ORDER BY depth, id
When I leave out either of the two conditions in the CONNECT BY clause I have no performance issues, but of course I don't get the results I want.
Upvotes: 0
Views: 32
Reputation: 4650
Try by creating an intermediary view (here the CTE vhierarchy_table) LEFT JOINing the lookup_table and back the hierarchy_table:
with hierarchy_table(id, alt_id, parent_id, parent_ext_id) as (
select 'A', 'A1', null, null from dual union all
select 'B', 'B1', 'A', null from dual union all
select 'C', 'C1', null, 'A1L' from dual union all
select 'D', null, 'A', null from dual union all
select 'E', 'E1', null, 'B1L' from dual union all
select 'F', null, 'B', null from dual -- union all
),
lookup_table(ext_id, alt_id) as (
select 'A1L', 'A1' from dual union all
select 'B1L', 'B1' from dual
),
vhierarchy_table as (
select ht.id, ht.alt_id, ht.parent_id, ht.parent_ext_id,
ht1.alt_id as alt_parent_id, ht1.id as parent
from hierarchy_table ht
left join lookup_table l on ht.parent_ext_id = l.ext_id
left join hierarchy_table ht1 on ht1.alt_id = l.alt_id
)
select level, ht.id as descendent_id,
NVL2(ht.parent_id,ht.parent_id,ht.parent) as parent
from vhierarchy_table ht
start with ht.parent_ext_id = 'A1L' or
ht.parent_id = (select ht.id from hierarchy_table ht join lookup_table l on l.alt_id = ht.alt_id where ext_id = 'A1L')
connect by
prior ht.id = ht.parent_id
or
prior ht.alt_id = ht.alt_parent_id
;
Upvotes: 0