Reputation: 981
Consider the following directed graph:
I'd like to be able to pick two nodes in this graph, and then find all of the possible paths between those two nodes. (Well, not all of the paths, just those paths not containing cycles.)
First, I model the graph in a view (as a transitive closure)
CREATE OR REPLACE VIEW closure AS
WITH
edges AS (
-- for demonstration purposes, I just need to model the edges
SELECT 'A' AS start_node, 'C' AS end_node FROM dual UNION ALL
SELECT 'A', 'D' FROM dual UNION ALL
SELECT 'B', 'C' FROM dual UNION ALL
SELECT 'B', 'D' FROM dual UNION ALL
SELECT 'C', 'E' FROM dual UNION ALL
SELECT 'D', 'F' FROM dual UNION ALL
SELECT 'E', 'A' FROM dual UNION ALL
SELECT 'E', 'F' FROM dual UNION ALL
SELECT 'E', 'G' FROM dual UNION ALL
SELECT 'F', 'B' FROM dual UNION ALL
SELECT 'F', 'G' FROM dual
)
SELECT
CONNECT_BY_ROOT start_node AS start_node,
end_node AS end_node,
CONNECT_BY_ROOT start_node||sys_connect_by_path(end_node, ' -> ') AS path,
level + 1 AS path_length
FROM
edges
CONNECT BY
NOCYCLE PRIOR end_node = start_node
-- this bit just adds the trivial paths to the closure
UNION ALL
SELECT DISTINCT start_node, start_node, start_node, 1 FROM edges
;
view CLOSURE created.
Now, suppose that I wanted to see all of the paths that start at 'E' and end at 'G'.
SELECT path, path_length
FROM closure
WHERE start_node='E'
AND end_node = 'G'
ORDER BY path_length ASC
;
PATH PATH_LENGTH
------------------------------ -----------
E -> G 2
E -> F -> G 3
E -> A -> C -> E -> G 5
E -> A -> D -> F -> G 5
E -> F -> B -> C -> E -> G 6
E -> A -> C -> E -> F -> G 6
E -> A -> D -> F -> B -> C -> E -> G 8
7 rows selected
I am surprised the the third, fifth, sixth, and seventh rows are returned. These four rows all contain a cycle (specifically, from 'E' to 'E').
Looking at the data in my view, it appears that the only cycles that are included are those that include the root node. (e.g., the cycle 'E' -> 'A' -> 'C' -> 'E') is only included in paths that begin with node 'E'... it is never included in paths that begin with any other node.)
I have tried to prevent these cycles from happening by adding a CONNECT_BY_ROOT start_node <> start_node
in my "CONNECT BY" clause, but this is not allowed by Oracle.
How can I prevent these cycles from occurring in my data?
==== BEGIN EDIT ====
As @Egor Skriptunoff pointed out below, I am working with a table of edges, so maybe I should expect to see some vertices repeated, as long as they are not visited over the same edge. However, if this were the case, I should see the path 'E -> F -> B -> D -> F -> G' in the results above. This is a path that cycles through node 'F', once via edge 'E -> F' and once via 'D -> F'. However, this path is not there.
I presume this is because the NOCYCLE is preventing node 'F' from occurring twice. Is this really the case?
Oracle version: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
==== END EDIT ====
Upvotes: 3
Views: 1014
Reputation: 17258
Your select clause is flawed - you have to apply CONNECT_BY_ROOT to the head of the edge instead of its tail
-- ...
CONNECT_BY_ROOT end_node AS start_node,
end_node AS end_node,
sys_connect_by_path(end_node, ' -> ') AS path,
level AS path_length
-- ...
Upvotes: 1
Reputation: 23767
You are working with a table of edges, so it is repeated edges that are excluded, vertices may repeat.
Upvotes: 1