user384929
user384929

Reputation:

How can I build a nested tree with CTE from a spaced inner text?

Given the following table of a data (Imported data from Comma Delimited Components List)

[SQL CTE]
;WITH COMPONENTS(ID, TEXT, CHILD) AS 
(
   SELECT 
     ID,TEXT, (CASE WHEN PATINDEX(' %',TEXT)=1 THEN 1 ELSE 0 END)
   FROM tblComponents
) 
SELECT * FROM COMPONENTS

[TABLE DATA]
ID  |   TEXT        |   ISCHILD
------------------------------------------------
1   |   PARENT1     |   0
2   |   PARENT2     |   0
3   |     CHILD1    |   1
4   |     CHILD2    |   1

The spacing of the data identifies that the above CHILD1, CHILD2 are both children of PARENT2. AS they have spacing at their beginning and their order assumes that they are children of ID 2.

Is this possible; I've already run a query with PATINDEX(' %',TEXT)=1 to give me the ISCHILD.

Note: Assuming only Parent->ManyChildren (regardless of how many spaces are in TEXT) and no grandchildren; I am interested only in the initial relationship.

[ORIGINAL DATA]
ID,COMPONENT,
44,"COMPONENT1 contains:",
45,"  CHILD1 ",
46,"  CHILD2   ",
47,"  CHILD3 ",

Expected/Needed Output:

ID  |   TEXT        |   PARENT_ID
------------------------------------------------
1   |   PARENT1     |   0
2   |   PARENT2     |   0
3   |     CHILD1    |   2
4   |     CHILD2    |   2

Upvotes: 1

Views: 173

Answers (1)

JNevill
JNevill

Reputation: 50034

The big problem you are going to run into is that you are encoding information about your data in the ORDER of the records in your table. I can guarantee you that this will only lead to death and destruction.

Thankfully you do have an "ID" that appears to maintain the order of the data outside of the order that the records are written to the filesystem, so that's a step in the right direction. Furthermore you have identified children by looking for leading whitespace. So you are on the right track.

The next step is to determine which Parent the children belong to. You could do something like:

SELECT
    id,
    text,
    ischild,
    (
        SELECT TOP 1 id FROM Components WHERE isChild = 0 AND children.id > Components.id   ORDER BY id ASC
    ) AS parent_id

FROM components as children

If the hierarchy is deeper than just one level and the only way you can determine that a child is connected to a parent is by the spacing at the front then you could get more creative:

SELECT
    id,
    text,
    ischild,
    (
        SELECT TOP 1 id FROM Components
        WHERE 
            children.id > Components.id AND
            (len(Components.text) - len(replace(Components.text,' ',''))) < (len(children.text) - len(replace(children.text, ' ',''))) AND
            Components.isChild = 0
        ORDER BY id ASC
    ) AS parent_id

FROM components as children

With this last one we are moving through each record and then looking for the highest parent id that has:

  1. an ID that is less than the child record we are looking at
  2. has less leading spaces than the child record we are looking at

The recordset from this statement will now have the child's id and the parent's id. If the parent's id is NULL is then we know we are at a root node of the hierarchy.

From here you can build a recursive view to look at the hierarchical path, depth of node, and whatever else you want to squeeze out of a hierarchy:

WITH recursiveCTE
AS
(
    SELECT 
        id,
        text,
        parent_id,
        CAST(NULL as VARCHAR(500)) as path
        0 as depth
    FROM new_components_table
    WHERE parent_id IS NULL

    UNION ALL

    SELECT
        nct.id,
        nct.text,
        nct.parent_id,
        cte.path + '>' + nct.text as path,
        cte.depth + 1 as depth
    FROM
        recursiveCTE cte
        INNER JOIN new_components_table nct ON
            cte.id = nct.parent_id
)
SELECT id,text,parent_id, path, depth FROM recursiveCTE

Upvotes: 2

Related Questions