Reputation: 1629
This is unrelated to how to find all the longest paths with cypher query?, and somewhat related to the answer in Find all relationship disjoint longest paths in cypher/traversal API ordered by size except my condition is a bit different:
I would like to write a cypher query that returns paths whose collections of nodes are "unique", in the sense that no two nodes in the path share the same name
property.
and the cypher to make it:
CREATE (a1:Node {name: 'A', time:1}),
(c1:Node {name: 'C', time:2}),
(b1:Node {name: 'B', time:3}),
(d1:Node {name: 'D', time:4}),
(c2:Node {name: 'C', time:5}),
(a2:Node {name: 'A', time:6}),
(a3:Node {name: 'A', time:7}),
(b2:Node {name: 'B', time:8}),
(d2:Node {name: 'D', time:9})
CREATE (a1)-[:NEXT]->(b1)-[:NEXT]->(c2)-[:NEXT]->(a2)-[:NEXT]->(b2),
(a2)-[:NEXT]->(a3)-[:NEXT]->(d2),
(c1)-[:NEXT]->(b1),
(d1)-[:NEXT]->(c2)
RETURN *
In this graph, the following paths are considered longest, and unique:
(a1)-->(b1)-->(c2)
(c1)-->(b1)
(d1)-->(c2)-->(a2)-->(b2)
(a3)-->(d2)
Some example of paths that should not be returned from the query are
(c1)-->(b1)-->(c2)
since it contains two instances of nodes with name, "C" (a2)-->(b2)
since it is contained in the larger path (d1)-->(c2)-->(a2)-->(b2)
Any help would be much appreciated.
Upvotes: 2
Views: 1646
Reputation: 29172
//
// Find all path:
MATCH path = (a:Node)-[:NEXT*1..]->(b:Node)
//
// Check that the names of the nodes in a path is unique
UNWIND nodes(path) as node
WITH path, nodes(path) as nodes,
collect(DISTINCT node.name) as unames
//
// And sort by path length
ORDER BY length(path) ASC
WHERE length(path)+1 = size(unames)
//
// Putting the appropriate path to the collection
WITH collect({f: id(HEAD(nodes)), // Fist node in path
l: id(LAST(nodes)), // Last node in path
ns: EXTRACT(n in nodes(path) | id(n)),
p: path
}) + {f: NULL, l: NULL, ns: [NULL]} as paths
//
// Loop through the paths in a double loop
// and check that the start and end nodes
// are not included in the following ascending path
UNWIND RANGE(0,size(paths)-2) as i
UNWIND RANGE(i+1,size(paths)-1) as j
WITH i, paths, paths[i]['p'] as path,
sum( size( FILTER(n in paths[j]['ns'] WHERE n=paths[i]['f']) )) as fc,
sum( size( FILTER(n in paths[j]['ns'] WHERE n=paths[i]['l']) )) as fl
WHERE fl=0 OR fc=0
//
// Return all appropriate ways
RETURN path ORDER BY length(path)
Upd: (let's try to add elegance)
//
// Find all path:
MATCH path = (a:Node)-[:NEXT*1..]->(b:Node)
//
// Check that the names of the nodes in a path is unique
UNWIND nodes(path) as node
WITH a, b, path,
collect(DISTINCT node.name) as unames
//
// And sort by path length
ORDER BY length(path) ASC
WHERE length(path)+1 = size(unames)
//
// Putting the appropriate path and first and last nodes to the collection
WITH collect({first: a, last: b, path: path}) + {} as paths
//
// Loop through the paths in a double loop
// and check that the start and end nodes
// are not included in the following ascending path
UNWIND RANGE(0,size(paths)-2) as i
WITH i, paths[i]['path'] as path, paths
UNWIND RANGE(i+1,size(paths)-1) as j
WITH path,
collect(distinct paths[i]['first'] IN nodes(paths[j]['path'])) as cFirst,
collect(distinct paths[i]['last'] IN nodes(paths[j]['path'])) as cLast
WHERE (size(cFirst)=1 AND cFirst[0] = false) OR
(size(cLast )=1 AND cLast [0] = false)
//
// Return all appropriate ways
RETURN path ORDER BY length(path)
Upvotes: 4