Reputation: 71
I would like to get help for a hierarchical query (Oracle 11gR2). I have a hard time with those kind of queries...
In fact, it's a 2 in 1 question (2 different approches needed).
I’m looking for a way to get the distance from all individials records to the root (not the opposite). My data are in a tree like structure:
CREATE TABLE MY_TREE
(ID_NAME VARCHAR2(1) PRIMARY KEY,
PARENT_ID VARCHAR2(1),
PARENT_DISTANCE NUMBER(2)
);
INSERT INTO MY_TREE (ID_NAME,PARENT_ID,PARENT_DISTANCE) VALUES('A',NULL,NULL);
INSERT INTO MY_TREE (ID_NAME,PARENT_ID,PARENT_DISTANCE) VALUES('B','A',1);
INSERT INTO MY_TREE (ID_NAME,PARENT_ID,PARENT_DISTANCE) VALUES('C','B',3);
INSERT INTO MY_TREE (ID_NAME,PARENT_ID,PARENT_DISTANCE) VALUES('D','B',5);
INSERT INTO MY_TREE (ID_NAME,PARENT_ID,PARENT_DISTANCE) VALUES('E','C',7);
INSERT INTO MY_TREE (ID_NAME,PARENT_ID,PARENT_DISTANCE) VALUES('F','D',11);
INSERT INTO MY_TREE (ID_NAME,PARENT_ID,PARENT_DISTANCE) VALUES('G','D',13);
Hierarchically, my data look like this (but I have multiple independents roots and many more levels):
In the first approch, I'm looking for a query that will give me this result:
LEVEL ROOT NODE ID_NAME ROOT_DISTANCE
----- ---- ---- ------- -------------
1 A null A null
2 A null B 1
3 A B C 4
4 A B E 11
3 A B D 6
4 A D F 17
4 A D G 19
In this result,
In this approch, I will always specify a maximum of 2 roots.
The second approch must be a PL/SQL script that will do the same calculation (ROOT_DISTANCE), but in a iterative way, and that will write the result in a new table. I want to run this script one time, so all roots (~1000) will be processed.
Here's the way I see the script:
This script is needed for "performance perspectives", so if an element have been already calculated (ex: a split node that was calculated by another leaf), we need to stop the calculation and pass to the next leaf because we already know the result from there to the root. For example, if the system calculates E-C-B-A, and then F-D-B-A, the B-A section should not be calcultated again because it was done in the first pass.
You can awnser one or both of those questions, but i will need the awnser to those two questions.
Thank you!
Upvotes: 4
Views: 1705
Reputation:
Here is how you can solve this with a hierarchical (connect by
) query.
In most hierarchical problems, hierarchical queries will be faster than recursive queries (recursive with
clause). However, your question is not purely hierarchical - you need to compute the distance to root, and unlike recursive with
, you can't get that in a single pass with hierarchical queries. So it will be interesting to hear - from you! - whether there are any significant performance differences between the approaches. For what it's worth, on the very small data sample you provided, the Optimizer estimated a cost of 5 using connect by
, vs. 48 for the recursive solution; whether that means anything in real life you will find out and hopefully you will let us know, too.
In the hierarchical query, I flag out the parents that have two or more children (I use an analytic function for this, to avoid a join). Then I build the hierarchy, and in the last step I aggregate to get the required bits. As in the recursive solution, this should give you everything you need - for all roots and all nodes - in a single SQL query; there is no need for PL/SQL code.
with
branching (id_name, parent_id, parent_distance, b) as (
select id_name, parent_id, parent_distance,
case when count(*) over (partition by parent_id) > 1 then 'y' end
from my_tree
)
, hierarchy (lvl, leaf, id_name, parent_id, parent_distance, b) as (
select level, connect_by_root id_name, id_name, parent_id, parent_distance, b
from branching
connect by id_name = prior parent_id
)
select max(lvl) as lvl,
min(id_name) keep (dense_rank last order by lvl) as root,
leaf as id_name,
min(decode(b, 'y', parent_id))
keep (dense_rank first order by decode(b, 'y', lvl)) as node,
sum(parent_distance) as root_distance
from hierarchy
group by leaf;
LVL ROOT ID_NAME NODE ROOT_DISTANCE
--- ------- ------- ------- -------------
1 A A
2 A B 1
3 A C B 4
3 A D B 6
4 A E B 11
4 A F D 17
4 A G D 19
Upvotes: 0
Reputation: 36107
Try this one:
WITH brumba(le_vel,root,node,id_name,root_distance) AS (
SELECT 1 as le_vel, id_name as root, null as node, id_name, to_number(null) as root_distance
FROM MY_TREE WHERE parent_id IS NULL
UNION ALL
SELECT b.le_vel + 1, b.root,
CASE WHEN 1 < (
SELECT count(*) FROM MY_TREE t1 WHERE t1.parent_id = t.parent_id
)
THEN t.parent_id ELSE b.node
END,
t.id_name, coalesce(b.root_distance,0)+t.parent_distance
FROM MY_TREE t
JOIN brumba b ON b.id_name = t.parent_id
)
SELECT * FROM brumba
Demo: https://dbfiddle.uk/?rdbms=oracle_11.2&fiddle=d5c231055e989c3cbcd763f4b3d3033f
There is no need for "the second approch" using PL/SQL - the above SQL will calculate results for all root nodes (which have null in parent_id
column) at once.
Just add a prefix either INSERT INTO tablename(col1,col2, ... colN) ...
or CREATE TABLE name AS ...
to the above query.
The above demo contains an example for the latter option CREATE TABLE xxx AS query
Upvotes: 3
Reputation: 142705
Here's one option which shows how to get the first part of your question:
SQL> with temp as
2 (select level lvl,
3 ltrim(sys_connect_by_path(id_name, ','), ',') path
4 from my_tree
5 start with parent_id is null
6 connect by prior id_name = parent_id
7 ),
8 inter as
9 (select t.lvl,
10 t.path,
11 regexp_substr(t.path, '[^,]+', 1, column_value) col
12 from temp t,
13 table(cast(multiset(select level from dual
14 connect by level <= regexp_count(path, ',') + 1
15 ) as sys.odcinumberlist ))
16 )
17 select i.lvl,
18 i.path,
19 sum(m.parent_distance) dis
20 from inter i join my_tree m on m.id_name = i.col
21 group by i.lvl, i.path
22 order by i.path;
LVL PATH DIS
---- ---------- ----------
1 A
2 A,B 1
3 A,B,C 4
4 A,B,C,E 11
3 A,B,D 6
4 A,B,D,F 17
4 A,B,D,G 19
7 rows selected.
SQL>
Upvotes: 1