Reputation: 11788
I need to implement a multi-parented tree (or digraph) onto SQL Server 2005. I've read several articles, but most of them uses single-parented trees with a unique root like the following one.
-My PC
-Drive C
-Documents and Settings
-Program Files
-Adobe
-Microsoft
-Folder X
-Drive D
-Folder Y
-Folder Z
In this one, everything derives from a root element (My PC).
In my case, a child could have more than 1 parent, like the following:
G A
\ /
B
/ \
X C
/ \
D E
\ /
F
So I have the following code:
create table #ObjectRelations
(
Id varchar(20),
NextId varchar(20)
)
insert into #ObjectRelations values ('G', 'B')
insert into #ObjectRelations values ('A', 'B')
insert into #ObjectRelations values ('B', 'C')
insert into #ObjectRelations values ('B', 'X')
insert into #ObjectRelations values ('C', 'E')
insert into #ObjectRelations values ('C', 'D')
insert into #ObjectRelations values ('E', 'F')
insert into #ObjectRelations values ('D', 'F')
declare @id varchar(20)
set @id = 'A';
WITH Objects (Id, NextId) AS
( -- This is the 'Anchor' or starting point of the recursive query
SELECT rel.Id,
rel.NextId
FROM #ObjectRelations rel
WHERE rel.Id = @id
UNION ALL -- This is the recursive portion of the query
SELECT rel.Id,
rel.NextId
FROM #ObjectRelations rel
INNER JOIN Objects -- Note the reference to CTE table name (Recursive Join)
ON rel.Id = Objects.NextId
)
SELECT o.*
FROM Objects o
drop table #ObjectRelations
Which returns the following SET:
Id NextId
-------------------- --------------------
A B
B C
B X
C E
C D
D F
E F
Expected result SET:
Id NextId
-------------------- --------------------
G B
A B
B C
B X
C E
C D
D F
E F
Note that the relation G->B is missing, because it asks for an starting object (which doesn't work for me also, because I don't know the root object from the start) and using A as the start point will ignore the G->B relationship.
So, this code doesn't work in my case because it asks for a starting object, which is obvious in a SINGLE-parent tree (will always be the root object). But in multi-parent tree, you could have more than 1 "root" object (like in the example, G and A are the "root" objects, where root is an object which doesn't have a parent (ancestor)).
So I'm kind of stucked in here... I need to modify the query to NOT ask for a starting object and recursively traverse the entire tree. I don't know if that's possible with the (Id, NextId) implementation... may be I need to store it like a graph using some kind of Incidence matrix, adjacency matrix or whatever (see http://willets.org/sqlgraphs.html).
Any help? What do you think guys? Thank you very much for your time =)
Cheers!
Sources: Source 1 Source 2 Source 3
Upvotes: 2
Views: 6836
Reputation: 324
If you dont want to do the inserts suggested by Ronald,this would do!.
WITH CTE_MultiParent (ID, ParentID)
AS
(
SELECT ID, ParentID FROM #ObjectRelations
WHERE ID NOT IN
(
SELECT DISTINCT ParentID FROM #ObjectRelations
)
UNION ALL
SELECT ObjR.ID, ObjR.ParentID FROM #ObjectRelations ObjR INNER JOIN CTE_MultiParent
ON CTE_MultiParent.ParentID = ObjR.Id
)
SELECT DISTINCT * FROM CTE_MultiParent
Upvotes: 0
Reputation: 11788
Well, I finally came up with the following solution. It's the way I found to support multi-root trees and also cycling digraphs.
create table #ObjectRelations
(
Id varchar(20),
NextId varchar(20)
)
/* Cycle */
/*
insert into #ObjectRelations values ('A', 'B')
insert into #ObjectRelations values ('B', 'C')
insert into #ObjectRelations values ('C', 'A')
*/
/* Multi root */
insert into #ObjectRelations values ('G', 'B')
insert into #ObjectRelations values ('A', 'B')
insert into #ObjectRelations values ('B', 'C')
insert into #ObjectRelations values ('B', 'X')
insert into #ObjectRelations values ('C', 'E')
insert into #ObjectRelations values ('C', 'D')
insert into #ObjectRelations values ('E', 'F')
insert into #ObjectRelations values ('D', 'F')
declare @startIds table
(
Id varchar(20) primary key
)
;WITH
Ids (Id) AS
(
SELECT Id
FROM #ObjectRelations
),
NextIds (Id) AS
(
SELECT NextId
FROM #ObjectRelations
)
INSERT INTO @startIds
/* This select will not return anything since there are not objects without predecessor, because it's a cyclic of course */
SELECT DISTINCT
Ids.Id
FROM
Ids
LEFT JOIN
NextIds on Ids.Id = NextIds.Id
WHERE
NextIds.Id IS NULL
UNION
/* So let's just pick anyone. (the way I will be getting the starting object for a cyclic doesn't matter for the regarding problem)*/
SELECT TOP 1 Id FROM Ids
;WITH Objects (Id, NextId, [Level], Way) AS
( -- This is the 'Anchor' or starting point of the recursive query
SELECT rel.Id,
rel.NextId,
1,
CAST(rel.Id as VARCHAR(MAX))
FROM #ObjectRelations rel
WHERE rel.Id IN (SELECT Id FROM @startIds)
UNION ALL -- This is the recursive portion of the query
SELECT rel.Id,
rel.NextId,
[Level] + 1,
RecObjects.Way + ', ' + rel.Id
FROM #ObjectRelations rel
INNER JOIN Objects RecObjects -- Note the reference to CTE table name (Recursive Join)
ON rel.Id = RecObjects.NextId
WHERE RecObjects.Way NOT LIKE '%' + rel.Id + '%'
)
SELECT DISTINCT
Id,
NextId,
[Level]
FROM Objects
ORDER BY [Level]
drop table #ObjectRelations
Could be useful for somebody. It is for me =P Thanks
Upvotes: 2
Reputation: 32094
If you want to use all root objects as starting objects, you should first update your data to include information about the root objects (and the leaves). You should add the following inserts:
insert into #ObjectRelations values (NULL, 'G')
insert into #ObjectRelations values (NULL, 'A')
insert into #ObjectRelations values ('X', NULL)
insert into #ObjectRelations values ('F', NULL)
Of course you could also write your anchor query in such a way that you select as root nodes the records that have an Id
that does not occur as a NextId
, but this is easier.
Next, modify your anchor query to look like this:
SELECT rel.Id,
rel.NextId
FROM #ObjectRelations rel
WHERE rel.Id IS NULL
If you run this query, you'll see that you get a lot of duplicates, a lot of arcs occur multiple times. This is because you now have two results from your anchor query and therefore the tree is traversed two times.
This can be fixed by changing your select statement to this (note the DISTINCT
):
SELECT DISTINCT o.*
FROM Objects o
Upvotes: 1