Mahesha999
Mahesha999

Reputation: 24721

Cypher to get "unique longest simple (acyclic)" paths from given source node

I have this simple graph:

create (a:ent {id:'a'})-[:rel]->(:ent {id:'b'})-[:rel]->(c:ent {id:'c'})-[:rel]->(d:ent {id:'d'})-[:rel]->(:ent {id:'e'})-[:rel]->(a),
   (d)-[:rel]->(c),
   (c)-[:rel]->(f:ent {id:'f'})-[:rel]->(g:ent {id:'g'})-[:rel]->(a),
   (g)-[:rel]->(f)

It looks like this:
enter image description here

Given is 'a', that is node (:ent {id:'a'}), I want to write query that returns "exactly two unique longest" paths:

a->b->c->d->e
a->b->c->f->g

As you can see here I should be considering cycles in the graph here. It seems that query suggested here should be fine, which I rewrote as follows:

MATCH path=(:ent{id:'a'})-[:rel*]->(:ent)
WHERE ALL(n in nodes(path)
      WHERE 1=size(filter(m in nodes(path)WHERE m.id=n.id))
     )
RETURN path

I know the query does not give exact result as I intend to get, however if I understand the logic correctly, it does at least avoid the cyclic paths. I felt this query can be a good starting point. But it is giving me weird error:

key not found:   UNNAMED26

Whats wrong here? I am unable to pinpoint the error in the cypher with this unclear error description.

Update

I tried a new simpler query:

MATCH path=(s:ent{id:'a'})-[:rel*]->(d:ent)
WHERE not (d)-[:rel]->() OR (d)-[:rel]->(s)
RETURN extract(x IN nodes(path)| x.id) as result

It returns:

╒═════════════════════╕
│result               │
╞═════════════════════╡
│[a, b, c, d, e]      │
├─────────────────────┤
│[a, b, c, d, c, f, g]│
├─────────────────────┤
│[a, b, c, f, g]      │
└─────────────────────┘

As you can see it has one redundant path [a, b, c, d, c, f, g] caused due to the cycle (d)<->(c) cycle. I honestly feel the original/first query in this post should eliminate it. Can someone please tell me how can I make it work...?

Upvotes: 2

Views: 497

Answers (1)

Dave Bennett
Dave Bennett

Reputation: 11216

You can use the apoc.path.expandConfig to find the paths and then just filter out the shortest ones leaving only the longest.

// start with the a node
MATCH (a:ent {id :"a"})

// use expandConfig with the relationship type and direction
// use uniqueness of NODE_PATH to ensure that you don't backtrack
// optionally include a minlevel, maxlevel
CALL apoc.path.expandConfig(a, 
{
  relationshipFilter: 'rel>',
  uniqueness: 'NODE_PATH',
  minLevel: 2,
  maxLevel: 10
} ) yield path

// collect the paths and the get the max length
WITH COLLECT(path) AS paths, MAX(length(path)) AS longest_length

// remove any of the paths that are not the max length
WITH FILTER(path IN paths WHERE length(path)= longest_length) AS longest_paths

// return each path matching the max length
UNWIND longest_paths AS path
RETURN path

Updated: Alternate answer based on clarification from OP.

// alternate set of test data based on OP's comments
MERGE (a:ent {id:'a'})-[:rel]->(b:ent {id:'b'})-[:rel]->(c:ent {id:'c'})
MERGE (b)-[:rel]->(d:ent {id:'d'})
MERGE (b)-[:rel]->(e:ent {id:'e'})-[:rel]->(f:ent {id:'f'})
RETURN *

And the updated query

// start with the a node
MATCH (a:ent {id :"a"})

// use expandConfig with the relationship type and direction
// use uniqueness of NODE_PATH to ensure that you don't backtrack
// optionally include a minlevel, maxlevel
CALL apoc.path.expandConfig(a, 
{
  relationshipFilter: 'rel>',
  uniqueness: 'NODE_PATH',
  minLevel: 1,
  maxLevel: 10
} ) yield path

// create two collections: one of nodes of full paths and the other nodes of paths shorted by one
WITH COLLECT(path) AS paths,
     COLLECT(DISTINCT nodes(path)[0..size(nodes(path))-1]) AS all_but_last_nodes_of_paths

// filter out the nodes of paths that match nodes in the shorted list
// i.e. the paths that are already included in another list
WITH [p IN paths WHERE NOT nodes(p) IN all_but_last_nodes_of_paths] AS paths_to_keep

// return each path
UNWIND paths_to_keep AS path
RETURN path

Upvotes: 3

Related Questions