wdonahoe
wdonahoe

Reputation: 1073

Remove all shorter subsequences containing an ID in a recursive CTE

Very confusing title! Let me elaborate a little bit.

I have two tables; one that represents a module (a certain type of web page that can contain other modules) called module_info, and another that represents each page's location within that hierarchy of modules called www_menu.

module_info:

+----------+------------+---------+
| moduleID | modulename | menu_id |
+----------+------------+---------+
|        1 | tests      |       1 |
|        2 | docs       |       2 |
|        3 | mail       |       3 |
|        4 | networks   |       4 |
|        5 | payroll    |       5 |
|        6 | admin      |       6 |
|        7 | travel     |       7 |
|        8 | bios       |       8 |
+----------+------------+---------+

www_menu:

+--------+--------+-------------------+------------------+
| menuID | parent |       title       |     location     |
+--------+--------+-------------------+------------------+
|      1 |      0 | Tests             | modules/tests    |
|      2 |      0 | Testing Documents | modules/docs     |
|      3 |      0 | Mailing Lists     | modules/mail     |
|      4 |      1 | Network Services  | modules/networks |
|      5 |      1 | Payroll           | modules/payroll  |
|      6 |      2 | Administration    | modules/admin    |
|      7 |      3 | Travel            | modules/travel   |
|      8 |      4 | Biographies       | modules/bios     |
+--------+--------+-------------------+------------------+

Here is my query to display the hierarchy:

WITH [root] AS
(
    SELECT      www.menuID
                ,CAST(www.title + ' (' + mi.modulename + ')' AS VARCHAR(200)) AS [path]
                ,CAST(mi.id AS VARCHAR(20)) AS ids
    FROM        [dbo].[module_info] AS mi
    INNER JOIN  [dbo].[WWW_Menu] www ON www.menuid = mi.menu_id
    WHERE       www.location LIKE 'modules/%'
UNION ALL
    SELECT      leaf.menuID
                ,CAST([root].[path] + ' > ' + leaf.title AS VARCHAR(200))
                ,CAST([root].ids + ',' + CONVERT(VARCHAR(4), mi.id) AS VARCHAR(20))
    FROM        [dbo].[WWW_Menu] AS leaf
    INNER JOIN  [root] ON leaf.parent = [root].menuID
    INNER JOIN  [dbo].[module_info] AS mi ON leaf.menuID = mi.menu_id
)
SELECT [path], ids FROM [root]

And the result from running on the two tables above:

+----------------------------------------+-------+
|                  path                  |  ids  |
+----------------------------------------+-------+
| Tests                                  | 1     |
| Testing Documents                      | 2     |
| Mailing Lists                          | 3     |
| Network Services                       | 4     |
| Payroll                                | 5     |
| Administration                         | 6     |
| Travel                                 | 7     |
| Biographies                            | 8     |
| Tests > Network Services               | 1,4   |
| Tests > Payroll                        | 1,5   |
| Testing Documents > Administration     | 2,6   |
| Mailing Lists > Travel                 | 3,7   |
| Network Services > Biographies         | 4,8   |
| Tests > Network Services > Biographies | 1,4,8 |
+----------------------------------------+-------+

I will be displaying these results in a select box for the purpose of searching for documents within the selected module. Here's the thing; Biographies, Network Services > Biographies, and Tests > Network Services > Biographies are the same module.

For each module, I only want to display the longest hierarchy string that ends in that module

+----------------------------------------+-------+
|                  path                  |  ids  |
+----------------------------------------+-------+
| Tests                                  | 1     |
| Testing Documents                      | 2     |
| Mailing Lists                          | 3     |
| Tests > Network Services               | 1,4   |
| Tests > Payroll                        | 1,5   |
| Testing Documents > Administration     | 2,6   |
| Mailing Lists > Travel                 | 3,7   |
| Tests > Network Services > Biographies | 1,4,8 |
+----------------------------------------+-------+

Maybe this isn't properly a SQL question, and is better accomplished by some sort of iteration server-side. However, a self-contained SQL solution for this problem would be my preferred option. Thanks.

Upvotes: 2

Views: 51

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1270713

One method would involve a lot of string manipulation. Instead, as you go through the lists, keep track of the last module added to the list. Then use row_number() to get the longest for each of the "last" modules:

WITH [root] AS (
    SELECT www.menuID,
           CAST(www.title + ' (' + mi.modulename + ')' AS VARCHAR(200)) AS [path],
           CAST(mi.id AS VARCHAR(20)) AS ids,
           mi.id as lastId
    FROM [dbo].[module_info] mi INNER JOIN
         [dbo].[WWW_Menu] www 
         ON www.menuid = mi.menu_id
    WHERE www.location LIKE 'modules/%'
    UNION ALL
    SELECT leaf.menuID,
           CAST([root].[path] + ' > ' + leaf.title AS VARCHAR(200)),
           CAST([root].ids + ',' + CONVERT(VARCHAR(4), mi.id) AS VARCHAR(20)),
           mi.id
    FROM [dbo].[WWW_Menu] leaf INNER JOIN
         [root]
         ON leaf.parent = [root].menuID INNER JOIN
         [dbo].[module_info] mi
         ON leaf.menuID = mi.menu_id
    )
SELECT [path], ids
FROM (SELECT r.*,
             ROW_NUMBER() OVER (PARTITION BY lastId ORDER BY length(path) DESC) as seqnum
      FROM [root] r
     ) r
WHERE seqnum = 1;

Upvotes: 3

Related Questions