Reputation: 436
I have a table in a SQL Server 2008 R2 database that defines transitions between different states.
Example relevant table columns:
TransitionID | SourceStateID | DestinationStateID | WorkflowID
--------------------------------------------------------------
1 | 17 | 18 | 3
2 | 17 | 21 | 3
3 | 17 | 24 | 3
4 | 17 | 172 | 3
5 | 18 | 17 | 3
6 | 18 | 24 | 3
7 | 18 | 31 | 3
8 | 21 | 17 | 3
9 | 21 | 20 | 3
10 | 21 | 141 | 3
11 | 21 | 184 | 3
... etc.
My goal is to be able to define a starting StateID, an ending StateID, and the WorkflowID, and hopefully find a logical path from the starting StateID to the ending StateID within that workflow.
e.g. For the table above, if I supplied a starting StateID of 17, and ending StateID of 184, and a WorkflowID of 3, I could wind up with a 'path' of 17>21>184 (or more ideally, TransitionID 2 > TransitionID 11)
The Good:
The Bad:
There are certainly circular references on virtually every source/destination StateID (i.e. there's likely a transition from SourceStateID 1 to DestinationStateID 2, and also from SourceStateID 2 to DestinationStateID 1
There are certainly multiple possible paths from virtually any defined starting StateID and ending StateID
I realize it's some manner of CTE I'm after, but admittedly I find CTE's hard to grasp in general, and doubly so when circular references are a guaranteed problem.
The perfect solution would be one that selects the shortest path from the starting StateID to the ending StateID, but to be honest at this point if I can get something working that will reliably give me any valid path between the two states I'll be so happy.
Any SQL gurus out there have some direction you could point me in? I honestly don't even know where to begin to prevent circular issues like getting a path along the lines of 17>18>17>18>17>18...
DISGUSTING UPDATE I came up with this query which hurts me on an emotional level (with some help from this post: https://stackoverflow.com/a/11042012/3253311), but appears to be working.
DECLARE @sourceStatusId INT = 17
DECLARE @destinationStatusId INT = 24
DECLARE @workflowId INT = 3
DECLARE @sourceStatusIdString NVARCHAR(MAX) = CAST(@sourceStatusId AS NVARCHAR(MAX))
DECLARE @destinationStatusIdString NVARCHAR(MAX) = CAST(@destinationStatusId AS NVARCHAR(MAX))
DECLARE @workflowIdString NVARCHAR(MAX) = CAST(@workflowId AS NVARCHAR(MAX))
;WITH CTE ([Source], [Destination], [Sentinel]) AS
(
SELECT
CAST(t.[Source] AS NVARCHAR(MAX)) AS [Source],
CAST(t.[Destination] AS NVARCHAR(MAX)) AS [Destination],
[Sentinel] = CAST([Source] AS NVARCHAR(MAX))
FROM
Transitions t
WHERE
CAST([Source] AS NVARCHAR(MAX)) = @sourceStatusIdString AND
CAST([WorkflowID] AS NVARCHAR(MAX)) = @workflowIdString
UNION ALL
SELECT
CAST(CTE.[Destination] AS NVARCHAR(MAX)),
CAST(t.[Destination] AS NVARCHAR(MAX)),
CAST([Sentinel] AS NVARCHAR(MAX)) + CAST('|' AS NVARCHAR(MAX)) + CAST(CTE.[Destination] AS NVARCHAR(MAX))
FROM
CTE
JOIN Transitions t
ON CAST(t.[Source] AS NVARCHAR(MAX)) = CAST(CTE.[Destination] AS NVARCHAR(MAX))
WHERE
CHARINDEX(CTE.[Destination], Sentinel)=0
)
SELECT TOP 1
[Sentinel]
FROM
CTE
WHERE
LEFT([Sentinel], LEN(@sourceStatusIdString)) = @sourceStatusIdString AND
RIGHT([Sentinel], LEN(@destinationStatusIdString)) = @destinationStatusIdString
ORDER BY
LEN([Sentinel])
Upvotes: 2
Views: 349
Reputation: 852
Similar to Quassnoi's answer, but prevents circular references that start after the first element:
DECLARE @src int = 18, @dst int = 184;
WITH cte AS
(Select TransitionId, SourceStateId, DestinationStateID, SourceStateID as StartingState, 0 as Depth
, cast(TransitionId as varchar(max)) + ',' as IDPath
, cast(SourceStateID as varchar(max)) + ',' as States
FROM Transitions
WHERE SourceStateId = @src
UNION ALL
Select t.TransitionId, t.SourceStateId, t.DestinationStateID, StartingState, cte.Depth + 1
, cte.IDPath + cast(t.TransitionId as varchar(max)) + ','
, cte.States + cast(t.SourceStateID as varchar(max)) + ','
FROM cte JOIN Transitions t on cte.DestinationStateID = t.SourceStateId
and CHARINDEX(',' + cast(t.SourceStateID as varchar(max)) + ',', States) = 0 --prevent loop starting after first element
and t.DestinationStateID <> StartingState --prevent loop starting with first element
where cte.Depth < 10 -- prevent going too deep
)
select TransitionId, SourceStateId, DestinationStateID, Depth, left(IDPath, len(IDPath) - 1) IDPath, left(States, len(States) - 1) States
from cte
where DestinationStateID = @dst
order by Depth
Upvotes: 3
Reputation: 425643
WITH q (id, src, dst, sid, cnt, path) AS
(
SELECT transitionId, sourceStateId, destinationStateId, sourceStateId, 1,
CAST(transitionId AS VARCHAR(MAX)) path
FROM mytable
WHERE sourceStateId = 18
UNION ALL
SELECT m.transitionId, m.sourceStateId, m.destinationStateId,
CASE WHEN sid < sourceStateId THEN sid ELSE sourceStateId END, cnt + 1,
path + ', ' + CAST(transitionId AS VARCHAR(MAX))
FROM q
JOIN mytable m
ON m.sourceStateId = q.dst
AND m.sourceStateId <> q.sid
)
SELECT TOP 1
*
FROM q
WHERE dst = 184
ORDER BY
cnt DESC
See the fiddle: http://www.sqlfiddle.com/#!6/9342e/17
Upvotes: 2
Reputation: 4910
Here's my take on it... Building on Quassnoi's, but shorter than Kateract ;-)
http://www.sqlfiddle.com/#!6/322d3/4
WITH data AS (
SELECT TransitionId, SourceStateId, DestinationStateId,
'|' + CAST(transitionId AS VARCHAR(MAX)) as Path
FROM States
WHERE sourceStateId = 17
AND WorkflowID = 3
UNION ALL
SELECT m.TransitionId, m.sourceStateId, m.DestinationStateId,
Path + '|' + CAST(m.TransitionId AS VARCHAR(MAX))
FROM data d
JOIN States m ON
m.sourceStateId = d.DestinationStateId
AND CHARINDEX( '|' + convert(varchar(10),m.TransitionID) + '|', path) = 0
AND WorkflowID = 3
)
SELECT TOP 1 *
FROM data
WHERE DestinationStateId = 184
ORDER BY LEN(Path)
-- If you want the longest path... ORDER BY LEN(Path) DESC
Upvotes: 0