Reputation: 133
I have a problem when I use path variables.
For example, my hierarchy looks like this:
A-[Knows]-> B
B-[Knows]-> C
C-[Knows]-> D
D-[Knows]-> E
C-[Knows]-> F
F-[Knows]-> G
G-[Knows]-> B
My query is MATCH p = (x)<-[Knows*]->(y) Return p
The relationship has to be matched in both directions. The toy example only has nodes like A and B, but in the real application I don't know the direction of the relationships.
Because the graph contains a cycle, the path loops. The path starts at A and reaches C, but after C the next steps run into the cycle.
How can I avoid the loop, ignore it, remove it from the actual path, or stop when the path ends in B? I don't need the second traversal from B to C.
Expected result :
A->B->C->D->E
A->B->C->F->G->B // that is enough. The query must be stopped.
Actual result :
A->B->C->D->E
A->B->C->F->G->B->C->F->G->B..... loop trouble.
Upvotes: 6
Views: 4956
Reputation: 29172
How can we verify that the path includes a cycle?
A simple way in plain Cypher is to count the unique nodes in the path and compare that count with the path length plus one. A path with no repeated node has exactly LENGTH(path) + 1 distinct nodes:
MATCH path = (x)-[:KNOWS*]-(y)
UNWIND NODES(path) AS n
WITH path,
SIZE(COLLECT(DISTINCT n)) AS testLength
WHERE testLength = LENGTH(path) + 1
RETURN path
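To make the counting argument concrete outside the database, here is a minimal Python sketch of the same test. It assumes a path is given simply as a list of node names, and the helper name is_simple_path is made up for illustration; it is not part of Cypher or APOC.

```python
def is_simple_path(nodes):
    """Return True when no node is repeated in the path.

    A path with k relationships visits k + 1 nodes, so it is cycle-free
    exactly when the number of distinct nodes equals length + 1.
    """
    length = len(nodes) - 1          # number of relationships in the path
    return len(set(nodes)) == length + 1


print(is_simple_path(["A", "B", "C", "D", "E"]))        # prints True
print(is_simple_path(["A", "B", "C", "F", "G", "B"]))   # prints False (B appears twice)
```

Note that this test rejects A->B->C->F->G->B as well, because B is repeated. If that path should be kept, the check has to be applied to all nodes except the last one.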
You can simplify the query using the collection functions from the APOC library. For example:
MATCH path = (x)-[:KNOWS*]-(y)
WHERE SIZE(apoc.coll.toSet(NODES(path))) = LENGTH(path) + 1
RETURN path
Or, more simply, keep only the paths that contain no duplicated nodes:
MATCH path = (x)-[:KNOWS*]-(y)
WHERE apoc.coll.duplicates(NODES(path)) = []
RETURN path
Upvotes: 5
Reputation: 11216
Here is an example that gets your desired result.
A few things to note. I started from node A rather than applying the pattern across the entire graph. To get your desired result with Cypher, I queried the KNOWS relationship as directed. If you remove the direction you get more results, and I am not sure they are incorrect. Anyway, you can decide for yourself.
The query finds all of the paths and then filters out the ones that are already contained in the longest distinct paths.
First, I started with a little data:
CREATE (A:Person {name: 'A'})
CREATE (B:Person {name: 'B'})
CREATE (C:Person {name: 'C'})
CREATE (D:Person {name: 'D'})
CREATE (E:Person {name: 'E'})
CREATE (F:Person {name: 'F'})
CREATE (G:Person {name: 'G'})
CREATE (A)-[:KNOWS]->(B)
CREATE (B)-[:KNOWS]->(C)
CREATE (C)-[:KNOWS]->(D)
CREATE (D)-[:KNOWS]->(E)
CREATE (C)-[:KNOWS]->(F)
CREATE (F)-[:KNOWS]->(G)
CREATE (G)-[:KNOWS]->(B)
RETURN *
Here is the query:
// find all the variable length paths starting with A
MATCH path = (x:Person)-[:KNOWS*]->(y:Person)
WHERE x.name = 'A'
// collect the paths and collect the paths excluding the last node
WITH COLLECT(path) AS paths,
COLLECT(DISTINCT nodes(path)[0..size(nodes(path))-1]) AS all_but_last_nodes_of_paths
// filter the paths out that are already included (i.e. shorter ones)
WITH [p IN paths WHERE NOT nodes(p) IN all_but_last_nodes_of_paths] AS paths_to_keep
// return each path to keep
UNWIND paths_to_keep AS path
WITH reduce(path_txt = "", n in nodes(path) | path_txt + n.name + "->") AS path_txt
RETURN left(path_txt,size(path_txt)-2) AS path_txt
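For readers who want to see what the filtering step does without running Neo4j, the following is a rough Python sketch of the same idea on the sample data. It is only an illustration under assumptions: the graph is a hard-coded edge list, and the function names directed_paths and longest_distinct are made up here. The traversal uses each relationship at most once per path, which mirrors how Cypher treats a single pattern.

```python
# Sample graph from the question, as directed (source, target) pairs.
EDGES = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"),
         ("C", "F"), ("F", "G"), ("G", "B")]


def directed_paths(start, edges):
    """Enumerate every directed path from start, using each edge at most once per path."""
    results = []

    def walk(nodes, used):
        for i, (src, dst) in enumerate(edges):
            if src == nodes[-1] and i not in used:
                extended = nodes + [dst]
                results.append(extended)
                walk(extended, used | {i})

    walk([start], frozenset())
    return results


def longest_distinct(paths):
    """Drop every path that is just another path minus its last node."""
    prefixes = {tuple(p[:-1]) for p in paths}
    return [p for p in paths if tuple(p) not in prefixes]


for p in longest_distinct(directed_paths("A", EDGES)):
    print("->".join(p))
# prints:
# A->B->C->D->E
# A->B->C->F->G->B
```

The second path stops at B because the only relationship leaving B (B to C) has already been used in that path.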
Here is the same problem solved with APOC's expandConfig instead of plain Cypher. Note that it also relies on specifying the direction in the query to get your desired result set.
MATCH (x:Person)
WHERE x.name = 'A'
WITH x
CALL apoc.path.expandConfig(x, {
relationshipFilter:'KNOWS>',
labelFilter:'+Person',
uniqueness: 'RELATIONSHIP_PATH'
} ) YIELD path AS path
WITH COLLECT(path) AS paths,
COLLECT(DISTINCT nodes(path)[0..size(nodes(path))-1]) AS all_but_last_nodes_of_paths
WITH [p IN paths WHERE NOT nodes(p) IN all_but_last_nodes_of_paths] AS paths_to_keep
UNWIND paths_to_keep AS path
WITH reduce(path_txt = "", n in nodes(path) | path_txt + n.name + "->") AS path_txt
RETURN left(path_txt,size(path_txt)-2) AS path_txt
Upvotes: 0
Reputation: 7458
First, to avoid loops, Cypher has the following mechanism: within a single pattern, a relationship can only be traversed once.
The problem with your query is that you are searching for all the paths between all your nodes (that's why it takes some time).
If you only want the shortest path between two nodes, you can use Cypher's shortestPath function: MATCH p = shortestPath((x)-[:Knows*]-(y)) RETURN p
Moreover, if you know the end node, you should tell Cypher:
MATCH p = shortestPath((x)-[:Knows*]-(y))
WHERE x.name = 'A' AND y.name = 'B'
RETURN p
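As a rough picture of what a shortest-path search over undirected relationships returns on the sample data, here is a small breadth-first search in Python. This is only a sketch under assumptions: it is not how Neo4j implements shortestPath, the edge list is hard-coded from the question, and the function name shortest_path is made up for illustration.

```python
from collections import deque

# Sample graph from the question; direction is ignored during the search.
EDGES = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"),
         ("C", "F"), ("F", "G"), ("G", "B")]


def shortest_path(start, goal, edges):
    """Breadth-first search that treats every relationship as undirected."""
    neighbours = {}
    for src, dst in edges:
        neighbours.setdefault(src, []).append(dst)
        neighbours.setdefault(dst, []).append(src)

    queue = deque([[start]])
    seen = {start}                      # visited nodes, so cycles cannot loop
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in neighbours.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None                         # goal is not reachable from start


print(shortest_path("A", "B", EDGES))   # prints ['A', 'B']
print(shortest_path("A", "G", EDGES))   # prints ['A', 'B', 'G']
```

Because every node is visited at most once, the cycle B, C, F, G cannot cause an endless path here.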
Cheers
Upvotes: 2