KOB
KOB

Reputation: 4545

Query to retrieve all paths traversable from a given vertex

I am trying to write a query that retrieves all paths that are reachable from a specified vertex. In other words I am trying to retrieve the entire cluster/sub-graph that the vertex is connected to. A couple more constraints on the query are:

  1. inward edges should be traversed and included in the result (I am looking for all paths that are in any way connected to the root vertex.
  2. the search must stop at a specified depth of, say, 10 hops from the root vertex.
  3. Bonus constraint: I would prefer the result not to include paths which are complete sub-paths of other paths returned in the result.

I currently have the following two queries which appear to work expected on small, toy graphs I have tested them on. However, there seem to be some edge cases in our large, production graph that does not return all the paths/edges/vertices I would expect it to, but I cannot explain as to why this happens. The two queries also sometimes return some different vertices than each other.

I would prefer a fresh view on how to approach this query, rather than trying to adjust what I currently have, so please try to provide a solution before looking at my current solution below.

Query 1:

g.V(uid).repeat(bothE().bothV().simplePath()).until(loops().is_(10)).emit().dedup().path().by(valueMap(True))

Query 2:

g.V(uid).repeat(bothE().bothV().simplePath()).until(bothE().simplePath().count().is_(0).or_().loops().is_(10)).dedup().path().by(valueMap(True))

Upvotes: 0

Views: 1313

Answers (1)

Kelvin Lawrence
Kelvin Lawrence

Reputation: 14371

Using this simple binary tree as a test graph

g.addV('root').property('data',9).as('root').
  addV('node').property('data',5).as('b').
  addV('node').property('data',2).as('c').
  addV('node').property('data',11).as('d').
  addV('node').property('data',15).as('e').
  addV('node').property('data',10).as('f').
  addV('node').property('data',1).as('g').
  addV('node').property('data',8).as('h').
  addV('node').property('data',22).as('i').
  addV('node').property('data',16).as('j').
  addV('node').property('data',7).as('k').
  addV('node').property('data',51).as('l').  
  addV('node').property('data',13).as('m'). 
  addV('node').property('data',4).as('n'). 
  addE('left').from('root').to('b').
  addE('left').from('b').to('c').
  addE('right').from('root').to('d').
  addE('right').from('d').to('e').
  addE('right').from('e').to('i').
  addE('left').from('i').to('j').
  addE('left').from('d').to('f').
  addE('right').from('b').to('h').
  addE('left').from('h').to('k').
  addE('right').from('i').to('l').
  addE('left').from('e').to('m').
  addE('right').from('c').to('n').
  addE('left').from('c').to('g').iterate()

We could find all the paths using

gremlin>   g.V().hasLabel('root').
......1>   repeat(bothE().otherV().simplePath()).
......2>   until(__.not(bothE().simplePath())).
......3>   path().
......4>     by('data').
......5>     by(label) 

==>[9,right,11,left,10]
==>[9,left,5,left,2,left,1]
==>[9,left,5,left,2,right,4]
==>[9,left,5,right,8,left,7]
==>[9,right,11,right,15,left,13]
==>[9,right,11,right,15,right,22,left,16]
==>[9,right,11,right,15,right,22,right,51]   

Note that I used bothE().otherV() as you said in your case you may have some incoming edges as well as outgoing ones.

We could also use the subgraph step to return the whole sub graph containing both vertices and edges. This example finds the subtree that starts at the vertex for the value 5.

gremlin>   g.V().has('data',5).
......1>   repeat(bothE().subgraph('sg').otherV().simplePath()).
......2>   until(__.not(bothE().simplePath())).
......3>   cap('sg') 
==>tinkergraph[vertices:14 edges:13]  

Note that both of these approaches assumes that all paths end at leaf nodes. I left out the loops() test but you can add that in as needed.

Upvotes: 2

Related Questions