McFixit
McFixit

Reputation: 436

SQL Server CTE to find path from one ID to another ID

I have a table in a SQL Server 2008 R2 database that defines transitions between different states.

Example relevant table columns:

TransitionID | SourceStateID | DestinationStateID | WorkflowID
--------------------------------------------------------------
1            | 17            | 18                 | 3
2            | 17            | 21                 | 3
3            | 17            | 24                 | 3
4            | 17            | 172                | 3
5            | 18            | 17                 | 3
6            | 18            | 24                 | 3
7            | 18            | 31                 | 3
8            | 21            | 17                 | 3
9            | 21            | 20                 | 3
10           | 21            | 141                | 3
11           | 21            | 184                | 3

... etc.

My goal is to be able to define a starting StateID, an ending StateID, and the WorkflowID, and hopefully find a logical path from the starting StateID to the ending StateID within that workflow.

e.g. For the table above, if I supplied a starting StateID of 17, and ending StateID of 184, and a WorkflowID of 3, I could wind up with a 'path' of 17>21>184 (or more ideally, TransitionID 2 > TransitionID 11)

The Good:

The Bad:

I realize it's some manner of CTE I'm after, but admittedly I find CTE's hard to grasp in general, and doubly so when circular references are a guaranteed problem.

The perfect solution would be one that selects the shortest path from the starting StateID to the ending StateID, but to be honest at this point if I can get something working that will reliably give me any valid path between the two states I'll be so happy.

Any SQL gurus out there have some direction you could point me in? I honestly don't even know where to begin to prevent circular issues like getting a path along the lines of 17>18>17>18>17>18...

DISGUSTING UPDATE I came up with this query which hurts me on an emotional level (with some help from this post: https://stackoverflow.com/a/11042012/3253311), but appears to be working.

DECLARE @sourceStatusId INT = 17
DECLARE @destinationStatusId INT = 24
DECLARE @workflowId INT = 3

DECLARE @sourceStatusIdString NVARCHAR(MAX) = CAST(@sourceStatusId AS             NVARCHAR(MAX))
DECLARE @destinationStatusIdString NVARCHAR(MAX) = CAST(@destinationStatusId     AS NVARCHAR(MAX))
DECLARE @workflowIdString NVARCHAR(MAX) = CAST(@workflowId AS NVARCHAR(MAX))

;WITH CTE ([Source], [Destination], [Sentinel]) AS 
(
SELECT
    CAST(t.[Source] AS NVARCHAR(MAX)) AS [Source],
    CAST(t.[Destination] AS NVARCHAR(MAX)) AS [Destination],
    [Sentinel] = CAST([Source] AS NVARCHAR(MAX))
FROM
    Transitions t
WHERE
    CAST([Source] AS NVARCHAR(MAX)) = @sourceStatusIdString AND 
    CAST([WorkflowID] AS NVARCHAR(MAX)) = @workflowIdString

UNION ALL

SELECT
    CAST(CTE.[Destination] AS NVARCHAR(MAX)),
    CAST(t.[Destination] AS NVARCHAR(MAX)),
    CAST([Sentinel] AS NVARCHAR(MAX)) + CAST('|' AS NVARCHAR(MAX)) + CAST(CTE.[Destination] AS NVARCHAR(MAX))
FROM
    CTE
    JOIN Transitions t
        ON CAST(t.[Source] AS NVARCHAR(MAX)) = CAST(CTE.[Destination] AS     NVARCHAR(MAX))
WHERE
    CHARINDEX(CTE.[Destination], Sentinel)=0
)

SELECT TOP 1
[Sentinel]
FROM
CTE
WHERE
LEFT([Sentinel], LEN(@sourceStatusIdString)) = @sourceStatusIdString AND 
RIGHT([Sentinel], LEN(@destinationStatusIdString)) =     @destinationStatusIdString 
ORDER BY
LEN([Sentinel])

Upvotes: 2

Views: 349

Answers (3)

Kateract
Kateract

Reputation: 852

Similar to Quassnoi's answer, but prevents circular references that start after the first element:

DECLARE @src int = 18, @dst int = 184;
WITH cte  AS 
(Select TransitionId, SourceStateId, DestinationStateID, SourceStateID as StartingState, 0 as Depth
  , cast(TransitionId as varchar(max)) + ','  as IDPath  
  , cast(SourceStateID as varchar(max)) + ',' as States
 FROM Transitions
 WHERE SourceStateId = @src
 UNION ALL
 Select t.TransitionId, t.SourceStateId, t.DestinationStateID, StartingState, cte.Depth + 1
  , cte.IDPath + cast(t.TransitionId as varchar(max)) + ',' 
  , cte.States + cast(t.SourceStateID as varchar(max)) + ','
 FROM cte JOIN Transitions t on cte.DestinationStateID = t.SourceStateId 
 and CHARINDEX(',' + cast(t.SourceStateID as varchar(max)) + ',', States) = 0 --prevent loop starting after first element
 and t.DestinationStateID <> StartingState --prevent loop starting with first element
 where cte.Depth < 10 -- prevent going too deep
 )
 select TransitionId, SourceStateId, DestinationStateID, Depth, left(IDPath, len(IDPath) - 1) IDPath, left(States, len(States) - 1) States
 from cte
 where DestinationStateID = @dst
 order by Depth

Upvotes: 3

Quassnoi
Quassnoi

Reputation: 425643

WITH    q (id, src, dst, sid, cnt, path) AS
        (
        SELECT  transitionId, sourceStateId, destinationStateId, sourceStateId, 1,
                CAST(transitionId AS VARCHAR(MAX)) path
        FROM    mytable
        WHERE   sourceStateId = 18
        UNION ALL
        SELECT  m.transitionId, m.sourceStateId, m.destinationStateId,
                CASE WHEN sid < sourceStateId THEN sid ELSE sourceStateId END, cnt + 1,
                path + ', ' + CAST(transitionId AS VARCHAR(MAX))
        FROM    q
        JOIN    mytable m
        ON      m.sourceStateId = q.dst
                AND m.sourceStateId <> q.sid
        )
SELECT  TOP 1
        *
FROM    q
WHERE   dst = 184
ORDER BY
        cnt DESC

See the fiddle: http://www.sqlfiddle.com/#!6/9342e/17

Upvotes: 2

Spock
Spock

Reputation: 4910

Here's my take on it... Building on Quassnoi's, but shorter than Kateract ;-)

http://www.sqlfiddle.com/#!6/322d3/4

WITH data AS (
    SELECT  TransitionId, SourceStateId, DestinationStateId,
        '|' + CAST(transitionId AS VARCHAR(MAX)) as Path
    FROM    States
    WHERE   sourceStateId = 17
    AND WorkflowID = 3
    UNION ALL
    SELECT  m.TransitionId, m.sourceStateId, m.DestinationStateId,
        Path + '|' + CAST(m.TransitionId AS VARCHAR(MAX))
    FROM data d
        JOIN States m ON
            m.sourceStateId = d.DestinationStateId
        AND CHARINDEX( '|' + convert(varchar(10),m.TransitionID) + '|', path) = 0
        AND WorkflowID = 3
)
SELECT TOP 1 *
FROM data
WHERE DestinationStateId = 184
ORDER BY LEN(Path)
-- If you want the longest path... ORDER BY LEN(Path) DESC

Upvotes: 0

Related Questions