Hieu Nguyen

Reputation: 404

JanusGraph Gremlin graph traversal with `as` and `select` provides unexpected result

I have two graph traversals with the following results:

g.V().has("id", 2).outE("knows").inV()
==>v[4216]
==>v[8312]
g.V().has("id", 5).outE("knows").inV()
==>v[4216]
==>v[8312]

Basically, both vertices with id 2 and 5 have edges to the same two other vertices v[4216] and v[8312].

Now if I chain the two queries above, label each one with `as`, and then select the first label, the result is not what I expected.

g.V().has("id", 2).outE("knows").inV().dedup().as('a').V().has('id', 5).outE('knows').inV().dedup().as('b').select('a')
==>v[4216]
==>v[4216]

Since I only select `a`, I expected the same result as running the first traversal on its own, which returns v[4216] and v[8312].

What could be the issue?

The JanusGraph version is 0.5.3, and the TinkerPop version is 3.4.6.

Upvotes: 2

Views: 399

Answers (1)

Kelvin Lawrence

Reputation: 14371

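This is actually working as expected. After the first dedup there are two traversers, one labeled 'a' at each of the two vertices. The second V() is evaluated once for each of them, so the query fans out again and, in your case, four traversers reach the second dedup. That dedup compares only the current vertex, not the 'a' label carried in the path, so it keeps the first traverser to arrive at each vertex and drops the rest. Both survivors carry the same 'a', which is why select('a') returns the same vertex twice. Here is an example that I hope makes it clear.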

Using this graph:

g.addV('a').as('a').
  addV('b').as('b').
  addV('c').as('c').
  addV('d').as('d').
  addE('knows').from('a').to('c').
  addE('knows').from('a').to('d').
  addE('knows').from('b').to('c').
  addE('knows').from('b').to('d')   

We can inspect the flow of the query:

With the second dedup

gremlin> g.V().hasLabel('a').outE("knows").inV().dedup().as('a').V().hasLabel('b').outE('knows').inV().dedup().as('b').select('a').label()
==>c
==>c

Without the second dedup


gremlin> g.V().hasLabel('a').outE("knows").inV().dedup().as('a').V().hasLabel('b').outE('knows').inV().as('b').select('a').label()
==>c
==>c
==>d
==>d

Using a path step we can see exactly what happened, first with the second dedup and then without it:

gremlin> g.V().hasLabel('a').outE("knows").inV().dedup().as('a').V().hasLabel('b').outE('knows').inV().dedup().as('b').select('a').path()
==>[v[0],e[4][0-knows->2],v[2],v[1],e[6][1-knows->2],v[2],v[2]]
==>[v[0],e[4][0-knows->2],v[2],v[1],e[7][1-knows->3],v[3],v[2]]

gremlin> g.V().hasLabel('a').outE("knows").inV().dedup().as('a').V().hasLabel('b').outE('knows').inV().as('b').select('a').path()
==>[v[0],e[4][0-knows->2],v[2],v[1],e[6][1-knows->2],v[2],v[2]]
==>[v[0],e[4][0-knows->2],v[2],v[1],e[7][1-knows->3],v[3],v[2]]
==>[v[0],e[5][0-knows->3],v[3],v[1],e[6][1-knows->2],v[2],v[3]]
==>[v[0],e[5][0-knows->3],v[3],v[1],e[7][1-knows->3],v[3],v[3]]

Here are the same queries, but with just the labels shown in the results.

gremlin> g.V().hasLabel('a').outE("knows").inV().dedup().as('a').V().hasLabel('b').outE('knows').inV().dedup().as('b').select('a').path().by(label)
==>[a,knows,c,b,knows,c,c]
==>[a,knows,c,b,knows,d,c]   

gremlin> g.V().hasLabel('a').outE("knows").inV().dedup().as('a').V().hasLabel('b').outE('knows').inV().as('b').select('a').path().by(label)
==>[a,knows,c,b,knows,c,c]
==>[a,knows,c,b,knows,d,c]
==>[a,knows,d,b,knows,c,d]
==>[a,knows,d,b,knows,d,d]  
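
If it helps to see the traverser flow outside of Gremlin, here is a rough plain-Python sketch of the same thing. It is only a toy model and not how TinkerPop is implemented: the names KNOWS, dedup and run are made up for this illustration, and each traverser is reduced to its current vertex plus the vertices it was labeled with. It assumes the four-vertex sample graph above.

```python
# Toy model of Gremlin traversers for the sample graph above.
# Each traverser is (current_vertex, labels), where labels maps an
# as() label to the vertex it was attached to. Illustration only.

KNOWS = {"a": ["c", "d"], "b": ["c", "d"]}  # 'knows' edges of the sample graph


def dedup(traversers):
    """Keep the first traverser seen at each current vertex."""
    seen, kept = set(), []
    for current, labels in traversers:
        if current not in seen:
            seen.add(current)
            kept.append((current, labels))
    return kept


def run(second_dedup):
    """Mimic the chained query, with or without the second dedup."""
    # g.V().hasLabel('a').outE('knows').inV().dedup().as('a')
    stage = dedup([(v, {}) for v in KNOWS["a"]])
    stage = [(cur, {**labels, "a": cur}) for cur, labels in stage]
    # .V().hasLabel('b').outE('knows').inV(): every incoming traverser fans out
    stage = [(v, labels) for _, labels in stage for v in KNOWS["b"]]
    if second_dedup:
        # Compares current vertices only; the 'a' label plays no part.
        stage = dedup(stage)
    # .as('b').select('a')
    stage = [(cur, {**labels, "b": cur}) for cur, labels in stage]
    return [labels["a"] for _, labels in stage]


print(run(True))   # ['c', 'c']
print(run(False))  # ['c', 'c', 'd', 'd']
```

The two traversers that survive the second dedup in this model both carry 'a' = c, which matches the console output above. If the goal is one row per distinct 'a', the model suggests deduplicating the selected values instead, which in Gremlin terms would mean dropping the second dedup() and putting a dedup() after select('a').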

Upvotes: 4
