dk96m
dk96m

Reputation: 331

Multi-Table Hierarchy Query

I am trying to create a hierarchical query in SQL Server.

I have 4 tables:

Org3 is the highest level and org 6 is the lowest. As you can see org6 has a parent id in the org5 table, which has a parent id in the org4 table, which has a parent id in the org3 table.

It is possible for a parent to not have any children.

So say I have these values in the tables:

Org3 table

| org3_id | org3_name |
+---------+-----------+
| 1       | MS        |
| 2       | NS        |

Org4 table

| org4_id | org4_name | org4_org3id |
+---------+-----------+-------------+
| 1       | TS        | 1           |
| 2       | QS        | 1           |
| 3       | BS        | 1           |

Org5 table

| org5_id | org5_name | org5_org4id |
+---------+-----------+-------------+
| 1       | LS        | 1           |
| 2       | PS        | 1           |
| 3       | VS        | 2           |

Org6 table

| org6_id | org6_name | org6_org5id |
+---------+-----------+-------------+
| 1       | AS        | 1           |
| 2       | RS        | 1           |
| 3       | ZS        | 2           |

The result I would like to get is:

| org3_id | org3_name | org4_id | org4_name | org5_id | org5_name | org6_id | org6_name | path        |
|---------|-----------|---------|-----------|---------|-----------|---------|-----------|-------------|
| 1       | MS        | NULL    | NULL      | NULL    | NULL      | NULL    | NULL      | MS          |
| 1       | MS        | 1       | TS        | NULL    | NULL      | NULL    | NULL      | MS\TS       |
| 1       | MS        | 1       | TS        | 1       | LS        | NULL    | NULL      | MS\TS\LS    |
| 1       | MS        | 1       | TS        | 1       | LS        | 1       | AS        | MS\TS\LS\AS |
| 1       | MS        | 1       | TS        | 1       | LS        | 2       | RS        | MS\TS\LS\RS |
| 1       | MS        | 1       | TS        | 2       | PS        | NULL    | NULL      | MS\TS\PS    |
| 1       | MS        | 1       | TS        | 2       | PS        | 3       | ZS        | MS\TS\PS\ZS |
| 1       | MS        | 2       | QS        | NULL    | NULL      | NULL    | NULL      | MS\QS       |
| 1       | MS        | 2       | QS        | 3       | VS        | NULL    | NULL      | MS\QS\VS    |
| 1       | MS        | 3       | BS        | NULL    | NULL      | NULL    | NULL      | MS\BS       |
| 2       | NS        | NULL    | NULL      | NULL    | NULL      | NULL    | NULL      | NS          |

This is what I have tried.

SELECT 
    org3.org3_id,
    org3.org3_name,
    org3.org3_open_ind,
    org4.org4_id,
    org4.org4_name,
    org4.org4_open_ind,
    org5.org5_id,
    org5.org5_name,
    org5.org5_open_ind,
    org6.org6_id,
    org6.org6_name,
    org6.org6_open_ind,
    CONCAT(org3.org3_abbrv, '\', org4.org4_abbrv, 
           CASE
              WHEN org5.org5_abbrv IS NULL THEN ''
              ELSE CONCAT('\', org5.org5_abbrv)
           END, 
           CASE
              WHEN org6.org6_abbrv IS NULL THEN ''
              ELSE CONCAT('\', org6.org6_abbrv)
           END) AS [ORG PATH]                       
FROM 
    (SELECT
         *
     FROM
         TSTAFFORG3 
     WHERE
         org3_open_ind = 1) org3
LEFT OUTER JOIN 
    (SELECT
         *
     FROM
         TSTAFFORG4 
     WHERE
         org4_open_ind = 1) org4 ON org4.org4_org3id = org3.org3_id
LEFT OUTER JOIN 
    (SELECT
         *
     FROM
         TSTAFFORG5
     WHERE
         org5_open_ind = 1) org5 ON org5.org5_org4id = org4.org4_id
LEFT OUTER JOIN 
    (SELECT
         *
     FROM 
         TSTAFFORG6
     WHERE
         org6_open_ind = 1) org6 ON org6.org6_org5id = org5.org5_id
ORDER BY
    org3.org3_name, org4.org4_name, org5.org5_name, org6.org6_name

I think maybe a CTE query is needed, but I am not sure how to frame it in this case. If it was all in one table I think I could figure it out, but because it is multiple tables, I am having trouble figuring out the SQL. The query I tried doesn't show just the parent. It will only show where results where org3 has children.

Upvotes: 1

Views: 470

Answers (2)

Zack
Zack

Reputation: 2341

To be clear: You're modelling hierarchical data. There are a number of ways of storing hierarchical data in a RDBMS. Two are:

  1. Adjacency lists (e.g. self-referential tables)
  2. Materialized paths (e.g. TS/MS/RS).

Your data model seems problematic: If you wanted to add another level, would you add a new table?

If you are able to, you should move everything into one table:

orgs(org_id, org_name, parent_org)

This uses the adjacency list approach.

Then you could create a simple recursive CTE (or a number of self-joins) to get the materialized path.

Eschewing the Org ID (which would have to be regenerated if you put everything in one table), the following query will give you the results below:

WITH 
    -- sample data (I'm only using org names, not org IDs).
    Orgs(org_name, parent_org) AS
    (
        SELECT * FROM
        (
            VALUES
            ('MS', NULL),
            ('NS', NULL),
            ('TS', 'MS'),
            ('QS', 'MS'),
            ('BS', 'MS'),
            ('LS', 'TS'),
            ('PS', 'TS'),
            ('VS', 'QS'),
            ('AS', 'LS'),
            ('RS', 'LS'),
            ('ZS', 'PS')
        ) v(c1, c2)
    ),
    -- hierarchical/recursive CTE
    OrgsWithPath(org_name, parent_org, org_path) AS
    (
        SELECT org_name, parent_org, CAST(org_name AS VARCHAR(MAX))
        FROM Orgs
        WHERE parent_org IS NULL

        UNION ALL

        SELECT Orgs.org_name, Orgs.parent_org, OrgsWithPath.org_path + '\' + Orgs.org_name
        FROM OrgsWithPath
            INNER JOIN Orgs ON
                Orgs.parent_org = OrgsWithPath.org_name
    )

SELECT * FROM OrgsWithPath ORDER BY org_path
+----------+------------+-------------+
| org_name | parent_org |  org_path   |
+----------+------------+-------------+
| MS       | NULL       | MS          |
| BS       | MS         | MS\BS       |
| QS       | MS         | MS\QS       |
| VS       | QS         | MS\QS\VS    |
| TS       | MS         | MS\TS       |
| LS       | TS         | MS\TS\LS    |
| AS       | LS         | MS\TS\LS\AS |
| RS       | LS         | MS\TS\LS\RS |
| PS       | TS         | MS\TS\PS    |
| ZS       | PS         | MS\TS\PS\ZS |
| NS       | NULL       | NS          |
+----------+------------+-------------+

Note the ORDER BY in the final SELECT: This determines whether your query is depth-first (traversing the full path) or breadth-first (starting with all the top-level nodes, then proceeding). With this approach, it's also trivial to include a "Level", so you know whether it's a top-level node or some other level.

Getting the additional columns is a bit trickier, but can also be handled by the recursive CTE (using CASEand COALESCE):

WITH 
    OrgsWithPath(org_name, org_path, org_level, org3_name, org4_name, org5_name, org6_name) AS
    (
        SELECT 
            Orgs.org_name, 
            CAST(org_name AS VARCHAR(MAX)), 
            1,
            Orgs.org_name, 
            CAST(NULL AS VARCHAR(255)), 
            CAST(NULL AS VARCHAR(255)), 
            CAST(NULL AS VARCHAR(255))
        FROM Orgs
        WHERE parent_org IS NULL

        UNION ALL

        SELECT 
            Orgs.org_name,
            OrgsWithPath.org_path + '\' + Orgs.org_name,
            OrgsWithPath.org_level + 1,
            OrgsWithPath.org3_name,
            CASE WHEN OrgsWithPath.org_level+1 >= 2 THEN COALESCE(OrgsWithPath.org4_name, Orgs.org_name) END,
            CASE WHEN OrgsWithPath.org_level+1 >= 3 THEN COALESCE(OrgsWithPath.org5_name, Orgs.org_name) END,
            CASE WHEN OrgsWithPath.org_level+1 >= 4 THEN COALESCE(OrgsWithPath.org6_name, Orgs.org_name) END
        FROM OrgsWithPath
            INNER JOIN Orgs ON
                Orgs.parent_org = OrgsWithPath.org_name
    )

SELECT *
FROM OrgsWithPath 
ORDER BY org_path
+----------+-------------+-----------+-----------+-----------+-----------+-----------+
| org_name |  org_path   | org_level | org3_name | org4_name | org5_name | org6_name |
+----------+-------------+-----------+-----------+-----------+-----------+-----------+
| MS       | MS          |         1 | MS        | NULL      | NULL      | NULL      |
| BS       | MS\BS       |         2 | MS        | BS        | NULL      | NULL      |
| QS       | MS\QS       |         2 | MS        | QS        | NULL      | NULL      |
| VS       | MS\QS\VS    |         3 | MS        | QS        | VS        | NULL      |
| TS       | MS\TS       |         2 | MS        | TS        | NULL      | NULL      |
| LS       | MS\TS\LS    |         3 | MS        | TS        | LS        | NULL      |
| AS       | MS\TS\LS\AS |         4 | MS        | TS        | LS        | AS        |
| RS       | MS\TS\LS\RS |         4 | MS        | TS        | LS        | RS        |
| PS       | MS\TS\PS    |         3 | MS        | TS        | PS        | NULL      |
| ZS       | MS\TS\PS\ZS |         4 | MS        | TS        | PS        | ZS        |
| NS       | NS          |         1 | NS        | NULL      | NULL      | NULL      |
+----------+-------------+-----------+-----------+-----------+-----------+-----------+

Upvotes: 2

Luis Cazares
Luis Cazares

Reputation: 3585

This is a possible solution but I'm not how it would perform. It's basically adding an empty row to all levels.

        SELECT 
                org3.org3_id,
                org3.org3_name,
                org3.org3_open_ind,
                org4.org4_id,
                org4.org4_name,
                org4.org4_open_ind,
                org5.org5_id,
                org5.org5_name,
                org5.org5_open_ind,
                org6.org6_id,
                org6.org6_name,
                org6.org6_open_ind,
                CONCAT
                    (
                        org3.org3_name, 
                        '\' + org4.org4_name, 
                        '\' + org5.org5_name, 
                        '\' + org6.org6_name
                    )   AS [ORG PATH]                       
        FROM TSTAFFORG3 org3
        CROSS APPLY 
                (
                    SELECT  org4_id, org4_name, org4_open_ind, org4_org3id
                    FROM    TSTAFFORG4 
                    WHERE   org4_open_ind = 1
                    AND     org4_org3id = org3.org3_id
                    UNION ALL
                    SELECT NULL, NULL, NULL, org3.org3_id
                ) org4 
                CROSS APPLY
                (
                    SELECT  org5_id, org5_name, org5_open_ind, org5_org4id
                    FROM    TSTAFFORG5
                    WHERE   org5_open_ind = 1
                    AND     org5_org4id = org4.org4_id
                    UNION ALL
                    SELECT NULL, NULL, NULL, org4.org4_id
                ) org5 
                OUTER APPLY
                (
                    SELECT  org6_id, org6_name, org6_open_ind, org6_org5id
                    FROM    TSTAFFORG6
                    WHERE   org6_open_ind = 1
                    AND     org6_org5id = org5.org5_id
                    UNION ALL
                    SELECT NULL, NULL, NULL, org4.org4_id
                ) org6 
                    WHERE
                            org3_open_ind = 1
        ORDER BY
                org3.org3_name, org4.org4_name, org5.org5_name, org6.org6_name;

In case someone else has a different idea, I'm leaving the sample data in a consumable format.

CREATE TABLE TSTAFFORG3( 
 org3_id    int,
 org3_name  varchar(10),
 org3_open_ind  bit);

 INSERT INTO TSTAFFORG3 
 VALUES
    ( 1, 'MS', 1),
    ( 2, 'NS', 1);

CREATE TABLE TSTAFFORG4( 
 org4_id    int,
 org4_name  varchar(10),
 org4_org3id    int,
 org4_open_ind  bit);

 INSERT INTO TSTAFFORG4
 VALUES
    ( 1, 'TS', 1, 1),
    ( 2, 'QS', 1, 1),
    ( 3, 'BS', 1, 1);

CREATE TABLE TSTAFFORG5( 
 org5_id    int,
 org5_name  varchar(10),
 org5_org4id    int,
 org5_open_ind  bit);

 INSERT INTO TSTAFFORG5
 VALUES
    ( 1, 'LS', 1, 1),
    ( 2, 'PS', 1, 1),
    ( 3, 'VS', 2, 1);

CREATE TABLE TSTAFFORG6( 
 org6_id    int,
 org6_name  varchar(10),
 org6_org5id    int,
 org6_open_ind  bit);

 INSERT INTO TSTAFFORG6
 VALUES
    ( 1, 'AS', 1, 1),
    ( 2, 'RS', 1, 1),
    ( 3, 'ZS', 2, 1);

Upvotes: 2

Related Questions