cartbeforehorse
cartbeforehorse

Reputation: 3467

Fetch object of specified level-type from data hierarchy (Oracle 12 SQL)

My data-set inside table object_type_t looks something like the following:

OBJ_ID    PARENT_OBJ   OBJECT_TYPE   OBJECT_DESC
--------- ------------ ------------- -----------------------
ES01      <null>       ESTATE        Bucks Estate
BUI01     ES01         BUILDING      Leisure Centre
BUI02     ES01         BUILDING      Fire Station
BUI03     <null>       BUILDING      Housing Block
SQ01      BUI01        ROOM          Squash Court
BTR01     BUI02        ROOM          Bathroom 
AP01      BUI03        APARTMENT     Flat No. 1
AP02      BUI03        APARTMENT     Flat No. 2
BTR02     AP01         ROOM          Bathroom
BDR01     AP01         ROOM          Bedroom
BTR03     AP02         ROOM          Bathroom
SHR01     BTR01        OBJECT        Shower
SHR02     BTR02        OBJECT        Shower
SHR03     BTR03        OBJECT        Shower

Which in practical, hierarchical terms, looks something like this:

ES01
  |--> BUI01
  |      |--> SQ01
  |--> BUI02
  |      |--> BTR01
                |--> SHR01
=======
BUI03
  |--> AP01
  |      |--> BTR02
  |      |      |--> SHR02
  |      |--> BDR01
  |--> AP02
         |--> BTR03
                |--> SHR03

I know how to use hierarchical queries, such as CONNECT BY PRIOR. I'm also aware of how to find the root of the tree via connect_by_root. But what I am looking to do is find a given "level" of a tree (i.e. not the root level, but rather the "BUIDLING" level of a given object).

So for example, I would like to be able to query out every object in the hierarchy, which belongs to BUI01.

And then in reverse, given an object ID, I would like to be able to query out the associated (say) ROOM object_id for that object.

Things would be much easier if I could associate each OBJECT_TYPE with a given level. But as you see from the above example, BUILDING does not always appear at level 1 in the hierarchy.

My initial idea is to fetch the data into an intermediate tabular format (perhaps a materialized view) which would look like the following. This would allow me to find the data I want by simple SQL queries on the materialized view:

OBJ_ID    OBJECT_DESC      ESTATE_OBJ BUILDING_OBJ ROOM_OBJ
--------- ---------------- ---------- ------------ ----------
ES01      Bucks Estate     ES01
BUI01     Leisure Centre   ES01       BUI01
BUI02     Fire Station     ES01       BUI02
BUI03     Housing Block               BUI03
SQ01      Squash Court     ES01       BUI01        SQ01
BTR01     Bathroom         ES01       BUI02        BTR01
AP01      Flat No. 1                  BUI03
AP02      Flat No. 2                  BUI03
BTR02     Bathroom                    BUI03        BTR02
BDR01     Bedroom                     BUI03        BDR01
BTR03     Bathroom                    BUI03        BTR03
SHR01     Shower           ES01       BUI02        BTR01
SHR02     Shower                      BUI03        BTR02
SHR03     Shower                      BUI03        BTR03

But (short of writing PL/SLQ, which I would like to avoid), I haven't been able to concisely structure a query which would achieve this tabular format.

Does anyone know how I can do this? Can it be done?

Solutions must be executable in Oracle 12c.

Additionally: Performance is important, since my underlying data structure contains several hundred-thousand lines, and structures can be quite deep. So faster solutions will be preferred over slower ones :-)

Thanks for your help, in advance.

Upvotes: 1

Views: 334

Answers (3)

trincot
trincot

Reputation: 350147

The desired output has 3 columns which are determined by object types. In general this could be extended with more columns, one for each possible value for the field object_type. Even with the given example data, one could imagine an additional column apartment_obj.

To make this generic without the need to self-join the table as many times as there are object type values, one could use a combination of CONNECT BY and PIVOT clauses:

SELECT  *
FROM    (
            SELECT     obj_id,
                       object_desc,
                       CONNECT_BY_ROOT obj_id      AS pivot_col_value,
                       CONNECT_BY_ROOT object_type AS pivot_col_name
            FROM       object_type_t
            -- skip the STARTS WITH clause to get all connected pairs
            CONNECT BY parent_obj = PRIOR obj_id
        )
PIVOT   (
            MAX(pivot_col_value) AS obj
            FOR (pivot_col_name) IN (
                'ESTATE'   AS estate,
                'BUILDING' AS building,
                'ROOM'     AS room
            )
        );

The FOR ... IN clause has a hard-coded list of names of the desired columns -- without the _obj suffix, as that gets added during the pivot transformation.

Oracle does not allow this list to be dynamically retrieved. NB: there is an exception to this rule when using the PIVOT XML syntax, but there you get the XML output in one column, which you would then need to parse. That would be rather inefficient.

The sub-query with the CONNECT BY clause does not have a STARTS WITH clause, which makes that query take any record as starting point and produce the descendants from there. Together with the CONNECT_BY_ROOT selection, this allows to produce a full list of all connected pairs, where the distance between the two in the hierarchy can be anything. The JOIN then matches the deeper of the two, so you get all ancestors of that node (including the node itself). And those ancestors are then pivoted into columns.

The CONNECT BY sub-query could also be written in way that the hierarchy is traversed backwards. The output is the same, but maybe there is a performance difference. If so, I think that variation could have better performance, but I did not test this on large datasets:

SELECT  *
FROM    (
            SELECT     CONNECT_BY_ROOT obj_id      AS obj_id,
                       CONNECT_BY_ROOT object_desc AS object_desc,
                       obj_id                      AS pivot_col_value,
                       object_type                 AS pivot_col_name
            FROM       object_type_t
            -- Connect in backward direction:
            CONNECT BY obj_id = PRIOR parent_obj
        )
PIVOT   (
            MAX(pivot_col_value) AS obj
            FOR (pivot_col_name) IN (
                'ESTATE'   AS estate,
                'BUILDING' AS building,
                'ROOM'     AS room
            )
        );

Note that in this variant the CONNECT_BY_ROOT returns the deeper node of the pair, because of the opposite traversal.

Alternative based on self-joins (previous answer)

You could use this query:

SELECT    t1.obj_id, 
          t1.object_desc,
          CASE 'ESTATE'
              WHEN t1.object_type THEN t1.obj_id
              WHEN t2.object_type THEN t2.obj_id
              WHEN t3.object_type THEN t3.obj_id
          END estate_obj,
          CASE 'BUILDING'
              WHEN t1.object_type THEN t1.obj_id
              WHEN t2.object_type THEN t2.obj_id
              WHEN t3.object_type THEN t3.obj_id
          END building_obj,
          CASE 'ROOM'
              WHEN t1.object_type THEN t1.obj_id
              WHEN t2.object_type THEN t2.obj_id
              WHEN t3.object_type THEN t3.obj_id
          END room_obj
FROM      object_type_t t1
LEFT JOIN object_type_t t2 ON t2.obj_id = t1.parent_obj
LEFT JOIN object_type_t t3 ON t3.obj_id = t2.parent_obj

Upvotes: 1

cartbeforehorse
cartbeforehorse

Reputation: 3467

With many thanks to @trincot for the inspiration, I've carved out the following solution. It's not extremely quick on production data, but it does work on an arbitrarily deep tree. The only way in which this is not dynamic, is that one has to choose in advance which levels of the tree are to be extracted, and an additional column must be added to capture that data.

The principle is that one can build a sys_connect_by_path column, and use regular expressions to extract the required level-data from there.

WITH base_data (obj_id, parent_obj, object_type, object_desc) AS (
   SELECT 'ES01','','ESTATE','Bucks Estate' FROM dual union all
   SELECT 'BUI01','ES01','BUILDING','Leisure Centre' FROM dual union all
   SELECT 'BUI02','ES01','BUILDING','Fire Station' FROM dual union all
   SELECT 'BUI03','','BUILDING','Housing Block' FROM dual union all
   SELECT 'SQ01','BUI01','ROOM','Squash Court' FROM dual union all
   SELECT 'BTR01','BUI02','ROOM','Bathroom' FROM dual union all
   SELECT 'AP01','BUI03','APARTMENT','Flat No. 1' FROM dual union all
   SELECT 'AP02','BUI03','APARTMENT','Flat No. 2' FROM dual union all
   SELECT 'BTR02','AP01','ROOM','Bathroom' FROM dual union all
   SELECT 'BDR01','AP01','ROOM','Bedroom' FROM dual union all
   SELECT 'BTR03','AP02','ROOM','Bathroom' FROM dual union all
   SELECT 'SHR01','BTR01','OBJECT','Shower' FROM dual union all
   SELECT 'SHR02','BTR02','OBJECT','Shower' FROM dual union all
   SELECT 'SHR03','BTR03','OBJECT','Shower' FROM dual ),
obj_hierarchy AS (
   SELECT object_type, obj_id, object_desc, parent_obj, sys_connect_by_path(object_type||':'||obj_id,'/')||'/' r_path
   FROM   base_data
   START WITH parent_obj IS null
   CONNECT BY PRIOR obj_id = parent_obj
)
SELECT obj_id, object_desc,
       CASE
          WHEN instr(h.r_path, 'ESTATE:') > 1
          THEN regexp_replace (h.r_path,'.*/ESTATE:([^/]+).*$', '\1')
          ELSE ''
       END obj_estate,
       CASE
          WHEN instr(h.r_path, 'BUILDING:') > 1
          THEN regexp_replace (h.r_path,'.*/BUILDING:([^/]+).*$', '\1')
          ELSE ''
       END obj_building,
       CASE
          WHEN instr(h.r_path, 'ROOM:') > 1
          THEN regexp_replace (h.r_path,'.*/ROOM:([^/]+).*$', '\1')
          ELSE ''
       END obj_room
FROM   obj_hierarchy h

Upvotes: 0

Aleksej
Aleksej

Reputation: 22949

If I correctly understand your need, maybe you can avoid the tabular view, directly querying your table;

Say you want to find all the objects belonging to BUI01, you can try:

with test(OBJ_ID, PARENT_OBJ, OBJECT_TYPE, OBJECT_DESC) as
(
select 'ES01','','ESTATE','Bucks Estate' from dual union all
select 'BUI01','ES01','BUILDING','Leisure Centre' from dual union all
select 'BUI02','ES01','BUILDING','Fire Station' from dual union all
select 'BUI03','','BUILDING','Housing Block' from dual union all
select 'SQ01','BUI01','ROOM','Squash Court' from dual union all
select 'BTR01','BUI02','ROOM','Bathroom' from dual union all
select 'AP01','BUI03','APARTMENT','Flat No. 1' from dual union all
select 'AP02','BUI03','APARTMENT','Flat No. 2' from dual union all
select 'BTR02','AP01','ROOM','Bathroom' from dual union all
select 'BDR01','AP01','ROOM','Bedroom' from dual union all
select 'BTR03','AP02','ROOM','Bathroom' from dual union all
select 'SHR01','BTR01','OBJECT','Shower' from dual union all
select 'SHR02','BTR02','OBJECT','Shower' from dual union all
select 'SHR03','BTR03','OBJECT','Shower' from dual
)
select OBJECT_TYPE, OBJ_ID, OBJECT_DESC
from test
connect by prior obj_id = parent_obj
start with obj_ID = 'BUI01'

This considers BUI01 belonging to itself; if you don't want this, you can modify the query in quite simple way to cut off your starting value.

On the opposite way, say you're looking for the room in which SHR01 is, you can try with the following; it's basically the same recursive idea, but in ascending order, instead of descending the tree:

with test(OBJ_ID, PARENT_OBJ, OBJECT_TYPE, OBJECT_DESC) as
(...
)
SELECT *
FROM (
        select OBJECT_TYPE, OBJ_ID, OBJECT_DESC
        from test
        connect by obj_id = PRIOR parent_obj
        start with obj_ID = 'SHR01'
)
WHERE object_type = 'ROOM'

In both cases, you only scan your table once, without any other structure; this way, this has a chance to be fast enough.

Upvotes: 2

Related Questions