Henry Bai

Reputation: 447

TinkerPop gremlin count vertices only in a path()

When I query for a path, e.g.:

g.V(1).inE().outV().inE().outV().inE().outV().path()

The path() contains both vertices and edges. Is there any way to count only the vertices in the path and ignore the edges?

Upvotes: 4

Views: 1313

Answers (2)

Kfir Dadosh

Reputation: 1419

+1 for typeOf predicate support in Gremlin (TINKERPOP-2234).

In addition to @stephen's answer, you can also label the vertices and select only those:

g.V().repeat(outE().inV().as('v')).times(3).select(all,'v')

Also, if the graph provider supports lambdas, you can group the path elements by {it.class}:

g.V().repeat(outE().inV().as('v')).times(3).path()
    .map(unfold().groupCount().by({it.class}))
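To illustrate what grouping by class gives you, here is a minimal Python sketch of the same idea outside of Gremlin. The Vertex and Edge classes and the count_by_class helper are hypothetical stand-ins for illustration, not TinkerPop types; the point is only that a per-class tally of one path lets you read off the vertex count directly.

```python
from collections import Counter


class Vertex:
    """Hypothetical stand-in for a graph vertex (not a TinkerPop class)."""
    def __init__(self, vid):
        self.vid = vid


class Edge:
    """Hypothetical stand-in for a graph edge (not a TinkerPop class)."""
    def __init__(self, eid):
        self.eid = eid


def count_by_class(path):
    """Tally the elements of one path by their class name."""
    return Counter(type(element).__name__ for element in path)


# One path shaped like v[1] -e[8]-> v[4] -e[10]-> v[5]
path = [Vertex(1), Edge(8), Vertex(4), Edge(10), Vertex(5)]
counts = count_by_class(path)
vertex_total = counts["Vertex"]  # the vertex-only count for this path
```

In the Gremlin version the map keys would be the provider's own element classes, so the exact key you look up depends on the graph implementation.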

Upvotes: 1

stephen mallette

Reputation: 46216

Gremlin is missing something important that would make this really easy: it doesn't discern types very well for purposes of filtering, hence TINKERPOP-2234. I've altered your example a bit so that we have something a little trickier to work with:

gremlin> g.V(1).repeat(outE().inV()).emit().path()
==>[v[1],e[9][1-created->3],v[3]]
==>[v[1],e[7][1-knows->2],v[2]]
==>[v[1],e[8][1-knows->4],v[4]]
==>[v[1],e[8][1-knows->4],v[4],e[10][4-created->5],v[5]]
==>[v[1],e[8][1-knows->4],v[4],e[11][4-created->3],v[3]]

With repeat() we get variable-length Path instances, so counting the vertices dynamically is a bit trickier than in the fixed example in your question, where the pattern of the path is known and the count is easy to discern from the Gremlin itself. So, with a dynamic number of vertices and without TINKERPOP-2234, you have to get creative. A typical strategy is to filter away the edges by way of some label or property value that is unique to vertices:

gremlin> g.V(1).repeat(outE().inV()).emit().path().map(unfold().hasLabel('person','software').fold())
==>[v[1],v[3]]
==>[v[1],v[2]]
==>[v[1],v[4]]
==>[v[1],v[4],v[5]]
==>[v[1],v[4],v[3]]
gremlin> g.V(1).repeat(outE().inV()).emit().path().map(unfold().hasLabel('person','software').fold()).count(local)
==>2
==>2
==>2
==>3
==>3

Or perhaps use a property that every edge has and no vertex does:

gremlin> g.V(1).repeat(outE().inV()).emit().path().map(unfold().not(has('weight')).fold())
==>[v[1],v[3]]
==>[v[1],v[2]]
==>[v[1],v[4]]
==>[v[1],v[4],v[5]]
==>[v[1],v[4],v[3]]
gremlin> g.V(1).repeat(outE().inV()).emit().path().map(unfold().not(has('weight')).fold()).count(local)
==>2
==>2
==>2
==>3
==>3

If you don't have properties or labels in your schema that allow for this, you could probably use your traversal pattern to come up with some math to figure it out. In my case, I know that the number of vertices in my Path will always be (pathLength + 1) / 2, so:

gremlin> g.V(1).repeat(outE().inV()).emit().path().as('p').math('(p + 1) / 2').by(count(local))
==>2.0
==>2.0
==>2.0
==>3.0
==>3.0
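As a sanity check on that arithmetic, here is a small Python sketch that models a path as a plain list of tagged tuples and compares the formula against a direct count. The tuple representation and both helper names are assumptions made for illustration, not the TinkerPop Path API, and the formula only holds when the path strictly alternates vertex, edge, vertex and both starts and ends on a vertex.

```python
# A path is modelled as an alternating list of ("v", id) and ("e", id) tuples.
# This is a hypothetical representation, not a real TinkerPop Path.

def vertex_count_by_formula(path):
    """Vertices in an alternating v-e-v-...-v path: (length + 1) / 2."""
    return (len(path) + 1) // 2


def vertex_count_by_filter(path):
    """Count vertices directly by inspecting each element's kind tag."""
    return sum(1 for kind, _ in path if kind == "v")


paths = [
    [("v", 1), ("e", 9), ("v", 3)],
    [("v", 1), ("e", 8), ("v", 4), ("e", 10), ("v", 5)],
]

# The two approaches agree on every alternating path.
for p in paths:
    assert vertex_count_by_formula(p) == vertex_count_by_filter(p)
```

Integer division is used here so the result stays a whole number, whereas Gremlin's math() step returns a double, as the 2.0 and 3.0 in the output above show.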

Hopefully, one of those ways will inspire you to a solution.

Upvotes: 4
