Reputation: 3467
My data-set inside table object_type_t
looks something like the following:
OBJ_ID PARENT_OBJ OBJECT_TYPE OBJECT_DESC
--------- ------------ ------------- -----------------------
ES01 <null> ESTATE Bucks Estate
BUI01 ES01 BUILDING Leisure Centre
BUI02 ES01 BUILDING Fire Station
BUI03 <null> BUILDING Housing Block
SQ01 BUI01 ROOM Squash Court
BTR01 BUI02 ROOM Bathroom
AP01 BUI03 APARTMENT Flat No. 1
AP02 BUI03 APARTMENT Flat No. 2
BTR02 AP01 ROOM Bathroom
BDR01 AP01 ROOM Bedroom
BTR03 AP02 ROOM Bathroom
SHR01 BTR01 OBJECT Shower
SHR02 BTR02 OBJECT Shower
SHR03 BTR03 OBJECT Shower
Which in practical, hierarchical terms, looks something like this:
ES01
|--> BUI01
| |--> SQ01
|--> BUI02
| |--> BTR01
|--> SHR01
=======
BUI03
|--> AP01
| |--> BTR02
| | |--> SHR02
| |--> BDR01
|--> AP02
|--> BTR03
|--> SHR03
I know how to use hierarchical queries, such as CONNECT BY PRIOR
. I'm also aware of how to find the root of the tree via connect_by_root
. But what I am looking to do is find a given "level" of a tree (i.e. not the root level, but rather the "BUIDLING" level of a given object).
So for example, I would like to be able to query out every object in the hierarchy, which belongs to BUI01
.
And then in reverse, given an object ID, I would like to be able to query out the associated (say) ROOM object_id
for that object.
Things would be much easier if I could associate each OBJECT_TYPE
with a given level
. But as you see from the above example, BUILDING does not always appear at level 1 in the hierarchy.
My initial idea is to fetch the data into an intermediate tabular format (perhaps a materialized view) which would look like the following. This would allow me to find the data I want by simple SQL queries on the materialized view:
OBJ_ID OBJECT_DESC ESTATE_OBJ BUILDING_OBJ ROOM_OBJ
--------- ---------------- ---------- ------------ ----------
ES01 Bucks Estate ES01
BUI01 Leisure Centre ES01 BUI01
BUI02 Fire Station ES01 BUI02
BUI03 Housing Block BUI03
SQ01 Squash Court ES01 BUI01 SQ01
BTR01 Bathroom ES01 BUI02 BTR01
AP01 Flat No. 1 BUI03
AP02 Flat No. 2 BUI03
BTR02 Bathroom BUI03 BTR02
BDR01 Bedroom BUI03 BDR01
BTR03 Bathroom BUI03 BTR03
SHR01 Shower ES01 BUI02 BTR01
SHR02 Shower BUI03 BTR02
SHR03 Shower BUI03 BTR03
But (short of writing PL/SLQ, which I would like to avoid), I haven't been able to concisely structure a query which would achieve this tabular format.
Does anyone know how I can do this? Can it be done?
Solutions must be executable in Oracle 12c.
Additionally: Performance is important, since my underlying data structure contains several hundred-thousand lines, and structures can be quite deep. So faster solutions will be preferred over slower ones :-)
Thanks for your help, in advance.
Upvotes: 1
Views: 334
Reputation: 350147
The desired output has 3 columns which are determined by object types. In general this could be extended with more columns, one for each possible value for the field object_type
. Even with the given example data, one could imagine an additional column apartment_obj
.
To make this generic without the need to self-join the table as many times as there are object type values, one could use a combination of CONNECT BY
and PIVOT
clauses:
SELECT *
FROM (
SELECT obj_id,
object_desc,
CONNECT_BY_ROOT obj_id AS pivot_col_value,
CONNECT_BY_ROOT object_type AS pivot_col_name
FROM object_type_t
-- skip the STARTS WITH clause to get all connected pairs
CONNECT BY parent_obj = PRIOR obj_id
)
PIVOT (
MAX(pivot_col_value) AS obj
FOR (pivot_col_name) IN (
'ESTATE' AS estate,
'BUILDING' AS building,
'ROOM' AS room
)
);
The FOR ... IN
clause has a hard-coded list of names of the desired columns -- without the _obj
suffix, as that gets added during the pivot transformation.
Oracle does not allow this list to be dynamically retrieved. NB: there is an exception to this rule when using the PIVOT XML
syntax, but there you get the XML output in one column, which you would then need to parse. That would be rather inefficient.
The sub-query with the CONNECT BY
clause does not have a STARTS WITH
clause, which makes that query take any record as starting point and produce the descendants from there. Together with the CONNECT_BY_ROOT
selection, this allows to produce a full list of all connected pairs, where the distance between the two in the hierarchy can be anything. The JOIN
then matches the deeper of the two, so you get all ancestors of that node (including the node itself). And those ancestors are then pivoted into columns.
The CONNECT BY
sub-query could also be written in way that the hierarchy is traversed backwards. The output is the same, but maybe there is a performance difference. If so, I think that variation could have better performance, but I did not test this on large datasets:
SELECT *
FROM (
SELECT CONNECT_BY_ROOT obj_id AS obj_id,
CONNECT_BY_ROOT object_desc AS object_desc,
obj_id AS pivot_col_value,
object_type AS pivot_col_name
FROM object_type_t
-- Connect in backward direction:
CONNECT BY obj_id = PRIOR parent_obj
)
PIVOT (
MAX(pivot_col_value) AS obj
FOR (pivot_col_name) IN (
'ESTATE' AS estate,
'BUILDING' AS building,
'ROOM' AS room
)
);
Note that in this variant the CONNECT_BY_ROOT
returns the deeper node of the pair, because of the opposite traversal.
You could use this query:
SELECT t1.obj_id,
t1.object_desc,
CASE 'ESTATE'
WHEN t1.object_type THEN t1.obj_id
WHEN t2.object_type THEN t2.obj_id
WHEN t3.object_type THEN t3.obj_id
END estate_obj,
CASE 'BUILDING'
WHEN t1.object_type THEN t1.obj_id
WHEN t2.object_type THEN t2.obj_id
WHEN t3.object_type THEN t3.obj_id
END building_obj,
CASE 'ROOM'
WHEN t1.object_type THEN t1.obj_id
WHEN t2.object_type THEN t2.obj_id
WHEN t3.object_type THEN t3.obj_id
END room_obj
FROM object_type_t t1
LEFT JOIN object_type_t t2 ON t2.obj_id = t1.parent_obj
LEFT JOIN object_type_t t3 ON t3.obj_id = t2.parent_obj
Upvotes: 1
Reputation: 3467
With many thanks to @trincot for the inspiration, I've carved out the following solution. It's not extremely quick on production data, but it does work on an arbitrarily deep tree. The only way in which this is not dynamic, is that one has to choose in advance which levels of the tree are to be extracted, and an additional column must be added to capture that data.
The principle is that one can build a sys_connect_by_path
column, and use regular expressions to extract the required level-data from there.
WITH base_data (obj_id, parent_obj, object_type, object_desc) AS (
SELECT 'ES01','','ESTATE','Bucks Estate' FROM dual union all
SELECT 'BUI01','ES01','BUILDING','Leisure Centre' FROM dual union all
SELECT 'BUI02','ES01','BUILDING','Fire Station' FROM dual union all
SELECT 'BUI03','','BUILDING','Housing Block' FROM dual union all
SELECT 'SQ01','BUI01','ROOM','Squash Court' FROM dual union all
SELECT 'BTR01','BUI02','ROOM','Bathroom' FROM dual union all
SELECT 'AP01','BUI03','APARTMENT','Flat No. 1' FROM dual union all
SELECT 'AP02','BUI03','APARTMENT','Flat No. 2' FROM dual union all
SELECT 'BTR02','AP01','ROOM','Bathroom' FROM dual union all
SELECT 'BDR01','AP01','ROOM','Bedroom' FROM dual union all
SELECT 'BTR03','AP02','ROOM','Bathroom' FROM dual union all
SELECT 'SHR01','BTR01','OBJECT','Shower' FROM dual union all
SELECT 'SHR02','BTR02','OBJECT','Shower' FROM dual union all
SELECT 'SHR03','BTR03','OBJECT','Shower' FROM dual ),
obj_hierarchy AS (
SELECT object_type, obj_id, object_desc, parent_obj, sys_connect_by_path(object_type||':'||obj_id,'/')||'/' r_path
FROM base_data
START WITH parent_obj IS null
CONNECT BY PRIOR obj_id = parent_obj
)
SELECT obj_id, object_desc,
CASE
WHEN instr(h.r_path, 'ESTATE:') > 1
THEN regexp_replace (h.r_path,'.*/ESTATE:([^/]+).*$', '\1')
ELSE ''
END obj_estate,
CASE
WHEN instr(h.r_path, 'BUILDING:') > 1
THEN regexp_replace (h.r_path,'.*/BUILDING:([^/]+).*$', '\1')
ELSE ''
END obj_building,
CASE
WHEN instr(h.r_path, 'ROOM:') > 1
THEN regexp_replace (h.r_path,'.*/ROOM:([^/]+).*$', '\1')
ELSE ''
END obj_room
FROM obj_hierarchy h
Upvotes: 0
Reputation: 22949
If I correctly understand your need, maybe you can avoid the tabular view, directly querying your table;
Say you want to find all the objects belonging to BUI01
, you can try:
with test(OBJ_ID, PARENT_OBJ, OBJECT_TYPE, OBJECT_DESC) as
(
select 'ES01','','ESTATE','Bucks Estate' from dual union all
select 'BUI01','ES01','BUILDING','Leisure Centre' from dual union all
select 'BUI02','ES01','BUILDING','Fire Station' from dual union all
select 'BUI03','','BUILDING','Housing Block' from dual union all
select 'SQ01','BUI01','ROOM','Squash Court' from dual union all
select 'BTR01','BUI02','ROOM','Bathroom' from dual union all
select 'AP01','BUI03','APARTMENT','Flat No. 1' from dual union all
select 'AP02','BUI03','APARTMENT','Flat No. 2' from dual union all
select 'BTR02','AP01','ROOM','Bathroom' from dual union all
select 'BDR01','AP01','ROOM','Bedroom' from dual union all
select 'BTR03','AP02','ROOM','Bathroom' from dual union all
select 'SHR01','BTR01','OBJECT','Shower' from dual union all
select 'SHR02','BTR02','OBJECT','Shower' from dual union all
select 'SHR03','BTR03','OBJECT','Shower' from dual
)
select OBJECT_TYPE, OBJ_ID, OBJECT_DESC
from test
connect by prior obj_id = parent_obj
start with obj_ID = 'BUI01'
This considers BUI01
belonging to itself; if you don't want this, you can modify the query in quite simple way to cut off your starting value.
On the opposite way, say you're looking for the room in which SHR01
is, you can try with the following; it's basically the same recursive idea, but in ascending order, instead of descending the tree:
with test(OBJ_ID, PARENT_OBJ, OBJECT_TYPE, OBJECT_DESC) as
(...
)
SELECT *
FROM (
select OBJECT_TYPE, OBJ_ID, OBJECT_DESC
from test
connect by obj_id = PRIOR parent_obj
start with obj_ID = 'SHR01'
)
WHERE object_type = 'ROOM'
In both cases, you only scan your table once, without any other structure; this way, this has a chance to be fast enough.
Upvotes: 2