Reputation: 7260
Table:
create table tbl_nodess
(
nod1 varchar(50),
nod2 varchar(50)
);
Records:
insert into tbl_nodess values('A','B');
insert into tbl_nodess values('B','C');
insert into tbl_nodess values('C','B');
insert into tbl_nodess values('D','E');
insert into tbl_nodess values('E','F');
insert into tbl_nodess values('G','H');
insert into tbl_nodess values('B','D');
insert into tbl_nodess values('D','A');
insert into tbl_nodess values('I','J');
insert into tbl_nodess values('J','K');
insert into tbl_nodess values('K','L');
insert into tbl_nodess values('L','J');
Expected Result:
self_ref_nodes
------------------
A-B-C-B
A-B-D-A
I-J-K-L-J
Explanation about expected result: I want to find only those nodes which are self referencing.
Example:
A
referring B
,
B
referring C
,
C
referring B
The node C
again referring to node B
.
A
referring B
,
B
referring D
,
D
referring A
The node D
again referring to node A
.
I
referring J
, J
referring K
, K
referring L
, L
referring J
The node L
again referring to node J
(which is already in the chain).
Query:
;with cte AS (
select nod1, nod2,
convert(varchar(max), ('-'+ nod1+ '-'+ nod2+ '-')) as nodes, 1 as lev
from tbl_nodess n
where not exists(select 1 from tbl_nodess n1 where n.nod1 = n1.nod2)
union all
select cte.nod1, n.nod2,
convert(varchar(max), (cte.nodes+ n.nod2+ '-')) as nodes, lev + 1
from cte join
tbl_nodess n
on cte.nod2 = n.nod1
where nodes not like ('%-'+ n.nod2+ '-%')
)
select nodes
from cte
where not exists
(select 1
from cte cte2
where cte2.nodes like (cte.nodes+ '_%'))
Upvotes: 1
Views: 81
Reputation: 5403
I had a crack at this, but I had to make some major assumptions. First I stuck your data into a table variable:
DECLARE @tbl_nodess TABLE (nod1 VARCHAR(50), nod2 VARCHAR(50));
INSERT INTO @tbl_nodess VALUES('A','B');
INSERT INTO @tbl_nodess VALUES('B','C');
INSERT INTO @tbl_nodess VALUES('C','B');
INSERT INTO @tbl_nodess VALUES('D','E');
INSERT INTO @tbl_nodess VALUES('E','F');
INSERT INTO @tbl_nodess VALUES('G','H');
INSERT INTO @tbl_nodess VALUES('B','D');
INSERT INTO @tbl_nodess VALUES('D','A');
INSERT INTO @tbl_nodess VALUES('I','J');
INSERT INTO @tbl_nodess VALUES('J','K');
INSERT INTO @tbl_nodess VALUES('K','L');
INSERT INTO @tbl_nodess VALUES('L','J');
Then I wrote a recursive query:
WITH x AS (
SELECT
t.nod1,
t.nod2,
CONVERT(VARCHAR(500), t.nod1 + '-' + t.nod2) AS accumulator,
0 AS done
FROM
@tbl_nodess t
UNION ALL
SELECT
t.nod1,
t.nod2,
CONVERT(VARCHAR(500), x.accumulator + '-' + t.nod2) AS accumulator,
CASE WHEN CHARINDEX(t.nod2, x.accumulator) != 0 THEN 1 ELSE 0 END AS done
FROM
x
INNER JOIN @tbl_nodess t ON t.nod1 = x.nod2
WHERE
x.done = 0)
SELECT
accumulator AS self_ref_nodes
FROM
x
WHERE
done = 1;
So how does this work?
I have the recursive query, using the "X - Y -> Y - Z" rule. I also accumulate the chain, starting with nod1 - nod2, then adding any "new" nod2 when I find a match.
If I hit a loop then I want to stop, but I don't want to stop immediately. I want to stop on the next loop, so I use the "done" to handle this. I set it, and then stop when it's set to 1 on the next run through the recursion.
This prevents circular references, but if there were more elements then you might eventually hit the 100 limit.
It's also worth noting that I find a LOT of self-referencing loops, more than in your example:
L-J-K-L
K-L-J-K
J-K-L-J
I-J-K-L-J
D-A-B-D
D-A-B-C-B
B-D-A-B
C-B-C
C-B-D-A-B
B-C-B
A-B-D-A
A-B-C-B
Upvotes: 1