Reputation: 51073
Given a tree-structured TinkerPop graph with vertices connected by labeled parent-child relationships ([parent-PARENT_CHILD->child]
), what's the idiomatic way to traverse and find all those nodes?
I'm new to graph traversals, so it seems more or less straightforward to traverse them with a recursive function:
Stream<Vertex> depthFirst(Vertex v) {
Stream<Vertex> selfStream = Stream.of(v);
Iterator<Vertex> childIterator = v.vertices(Direction.OUT, PARENT_CHILD);
if (childIterator.hasNext()) {
return selfStream.appendAll(
Stream.ofAll(() -> childIterator)
.flatMap(this::depthFirst)
);
}
return selfStream;
}
(N.b. this example uses Vavr streams, but the Java stream version is similar, just slightly more verbose.)
I assume a graph-native implementation would be more performant, especially on databases other than the in-memory TinkerGraph.
However, when I look at the TinkerPop tree recipes, it's not obvious what combination of repeat()
/ until()
etc. is the right one to do what I want.
If I then want to find only those vertices (leaf or branch) having a certain label, again, I can see how to do it with the function above:
Stream<Vertex> nodesWithMyLabel = depthFirst(root)
.filter(v -> "myLabel".equals(v.label()));
but it's far from obvious that this is efficient, and I assume there must be a better graph-native approach.
Upvotes: 2
Views: 1142
Reputation: 51073
Here's what I ended up with, after a certain amount of trial and error:
GraphTraversal<Vertex, Vertex> traversal =
graph.traversal().V(parent)
.repeat(out(PARENT_CHILD)) // follow only edges labeled PARENT_CHILD
.emit()
.hasLabel("myLabel"); // filter for vertices labeled "myLabel"
Note that this is slightly different from the recursive version in the original question since I realized I don't actually want to include the parent in the result. (I think, from the Repeat Step docs, that I could include the parent by putting emit()
before repeat()
, but I haven't tried it.)
Upvotes: 0
Reputation: 46206
If you are using TinkerPop, it is best to just write your traversals with Gremlin. Let's use the tree described in the recipe:
g.addV().property(id, 'A').as('a').
addV().property(id, 'B').as('b').
addV().property(id, 'C').as('c').
addV().property(id, 'D').as('d').
addV().property(id, 'E').as('e').
addV().property(id, 'F').as('f').
addV().property(id, 'G').as('g').
addE('hasParent').from('a').to('b').
addE('hasParent').from('b').to('c').
addE('hasParent').from('d').to('c').
addE('hasParent').from('c').to('e').
addE('hasParent').from('e').to('f').
addE('hasParent').from('g').to('f').iterate()
To find all the children of "A", you simply do:
gremlin> g.V('A').repeat(out()).emit()
==>v[B]
==>v[C]
==>v[E]
==>v[F]
The traversal above basically says, "Start at 'A" vertex and traverse on out edges until there are no more, and oh, by the way, emit each of those child vertices as you go." If you want to also get the root of "A" then you just need to switch things around a bit:
gremlin> g.V('A').emit().repeat(out())
==>v[A]
==>v[B]
==>v[C]
==>v[E]
==>v[F]
Going a step further, if you want to emit only certain vertices based on some filter (in your question you specified label) you can just provide a filtering argument to emit()
. In this case, I only emit those vertices that have more than one incoming edge:
gremlin> g.V('A').emit(inE().count().is(gt(1))).repeat(out())
==>v[C]
==>v[F]
Upvotes: 5