Why does hasNot "resurrect" a dead traversal?

I'm trying to create a traversal that adds a vertex and then adds an edge from a known vertex to the new one. I have a library method that uses coalesce to check to see whether an existing edge is present (which it can't be) and add it if not. However, I'm reliably getting an edge added to the first child vertex and then no edges added to new child vertices. Here's the traversal:

gts.addV()... // add properties and such; this part reliably works
  .as('newV') // needed because I can't pass __addV() to the edge coalesce or I get two adds
  .V(parent)
  .coalesce(
    __.outE('Manages').where(inV().has(id, __select('newV').id())).hasNot(TTL_END),
    __.addE('Manages').to(__select('newV')).property(TTL_START, now)
  )

When I profile this traversal, I find something odd when adding the second and subsequent child vertices:

CoalesceStep([[VertexStep(OUT,[Manages],edge), ...                     1           1           0.469    18.82
  VertexStep(OUT,[Manages],edge)                                       1           1           0.020
  TraversalFilterStep([EdgeVertexStep(IN), Prof...                     1           1           0.188
    EdgeVertexStep(IN)                                                 1           1           0.010
    TraversalFilterStep([IdStep, ProfileStep, S...                                             0.091
      IdStep                                                           1           1           0.010
      SelectOneStep(last,newV)                                         1           1           0.020
      NoOpBarrierStep(2500)                                            1           1           0.019
      IdStep                                                                                   0.012
  NotStep([PropertiesStep([ttl.end],value), Pro...                     1           1           0.170
    PropertiesStep([ttl.end],value)                                                            0.008
EdgeVertexStep(IN)                                                     1           1           0.305    12.23

As best I can understand, this seems to be saying that the "id filter" is filtering out the non-matching new child1 and thus returning no traversers (which is what I expect), but then the hasNot step, which I expect to be applied in the pipeline following, pops back out to the top level, says "the edge (to the first child) doesn't have ttl.end, so I'll return it!", the coalesce takes it, and I don't get my edge to the second child.

My understanding was that once a traverser "died", additional filtering steps would simply be discarded as redundant, and nothing more would be propagated through the traversal, but what I expected to behave as an AND-filter instead seems to be "resurrecting" the traverser the ID filter should have "killed".

Why is the NotStep getting traversed even though its upstream filter shouldn't have matched? How can I produce the compound predicate I'm aiming for?

(I also tried this with the hasNot step first, and I got the same results, with the steps transposed in the profile output.)


Is the lack of any traverser at all on the second `IdStep` an indication of a problem?

Upvotes: 0

Views: 49

Answers (1)

stephen mallette

Reputation: 46216

The has(String,Traversal) step is perhaps the most misused one in existence. Users expect it to mean, "resolve the Traversal to a value and use that value in an equality comparison against the specified key." But, as you can see, that's not what it does:

gremlin> g = TinkerFactory.createModern().traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.V().as('a').has(id, select('a'))
==>v[1]
==>v[2]
==>v[3]
==>v[4]
==>v[5]
==>v[6]

The documentation tries to explain it with this (and perhaps other explanations/examples elsewhere):

has(key, traversal): Remove the traverser if its object does not yield a result through the traversal off the property value.

So with that context, consider what happens if you use select() for that Traversal argument: it effectively discards the "property value" that arrives as the traverser and substitutes whatever select() returns. Unless you select something that doesn't exist, select() returns a value and therefore the filter passes. We've discussed changing this behavior, but there are concerns about breaking existing code that relies on this feature. Perhaps it could yet change in the future...
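To make the documented semantics concrete, here is a small sketch against the same "modern" toy graph. The first query uses the step as the documentation describes it: the child traversal starts from the property value. The second is my reading of what has(id, select('a')) amounts to, based on the TraversalFilterStep([IdStep, ..., SelectOneStep(...)]) shape in your profile output; treat that equivalence as an assumption rather than a statement about the implementation.

```groovy
g = TinkerFactory.createModern().traversal()

// Documented use: the child traversal receives the value of 'age' and
// filters on it. Vertices without an 'age' property emit nothing and are dropped.
g.V().has('age', __.is(gt(30)))            // josh (32) and peter (35): v[4], v[6]

// Assumed equivalent of has(id, select('a')): id() feeds the filter, but
// select('a') swaps the id for the labelled vertex, so the child traversal
// always emits something and every vertex passes.
g.V().as('a').filter(__.id().select('a'))
```

In other words, the step tests whether the child traversal produces any result, not whether that result equals the property value.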

All that said, I think I'd re-write your traversal as follows:

gremlin> g = TinkerGraph.open().traversal()
==>graphtraversalsource[tinkergraph[vertices:0 edges:0], standard]
gremlin> g.addV('person').property(id,'parent')
==>v[parent]
gremlin> now = 100
==>100
gremlin> g.addV('person').as('newV'). 
......1>   V('parent').
......2>   coalesce(
......3>     __.outE('Manages').where(inV().where(eq('newV'))).hasNot("end"),
......4>     __.addE('Manages').to(__.select('newV')).property("start", now))
==>e[1][parent-Manages->0]
gremlin> g.addV('person').as('newV'). 
......1>   V('parent').
......2>   coalesce(
......3>     __.outE('Manages').where(inV().where(eq('newV'))).hasNot("end"),
......4>     __.addE('Manages').to(__.select('newV')).property("start", now))
==>e[3][parent-Manages->2]
gremlin> g.E().property('end',101)
==>e[1][parent-Manages->0]
==>e[3][parent-Manages->2]
gremlin> g.addV('person').as('newV'). 
......1>   V('parent').
......2>   coalesce(
......3>     __.outE('Manages').where(inV().where(eq('newV'))).hasNot("end"),
......4>     __.addE('Manages').to(__.select('newV')).property("start", now))
==>e[5][parent-Manages->4]

The nested where() is probably the better way to express this filter.
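A quick way to see the two filters side by side, sketched on the "modern" toy graph rather than your data (vertex 1 has 'knows' edges to vertices 2 and 4; the 'target' label is only an illustrative name):

```groovy
g = TinkerFactory.createModern().traversal()

// has(id, <traversal>) only checks that the child traversal emits something,
// so both 'knows' edges should pass: e[7][1-knows->2] and e[8][1-knows->4]
g.V(4).as('target').V(1).outE('knows').
  where(inV().has(id, __.select('target').id()))

// where(eq('target')) compares the incoming vertex with the labelled one,
// so only e[8][1-knows->4] should remain
g.V(4).as('target').V(1).outE('knows').
  where(inV().where(eq('target')))
```

With the second form in place, a hasNot() chained after it behaves as the AND-filter you expected, since an edge that fails the vertex comparison never reaches it.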

Upvotes: 1
