coderz

Reputation: 4989

Gremlin query hangs with and/or condition

I have a graph model: region (vertex) -> has_person (edge) -> person (vertex). I want to get the region vertices that have a person named Tom.

This query works fine: g.V().hasLabel("person").has("name", "Tom").inE("has_person").outV().hasLabel("region").

But why do the following queries hang?

g.V().hasLabel("region").and(
    __.hasLabel("person").has("name", "Tom").inE("has_person").outV().hasLabel("region")
)

g.V().and(
    __.hasLabel("person").has("name", "Tom").inE("has_person").outV().hasLabel("region")
).hasLabel("region")

Upvotes: 0

Views: 393

Answers (1)

stephen mallette

Reputation: 46206

When writing graph traversals with Gremlin, you need to think about how the graph database you are using optimizes your traversal (e.g. is a global index being used?).

You should consider the indexing capability of your graph database and examine the output of the profile() step. It will tell you if indices are being used and where. My guess is that the query that works "fine" is using an index to find "Tom" and is then able to quickly traverse from that one vertex to find the regions that have "has_person" edges related to him. Almost every graph will be capable of optimizing that sort of pattern. Your following queries that "hang" will typically not be optimized by most graphs to utilize an index, mostly because the and() step pattern you've chosen isn't one that most optimizations look for. My guess is that both of those traversals are filtering almost completely in-memory.
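As a rough sketch of that check, you can append profile() to the traversal that works. This assumes g is a traversal source already connected to your graph. The layout of the report and the way index usage is shown differ between graph providers, so no sample output is given here.

```groovy
// Assumes `g` is an existing traversal source for your graph.
// profile() reports the steps that were actually executed after the
// provider's optimizations, with traverser counts and timings per step.
g.V().hasLabel("person").has("name", "Tom").
  inE("has_person").outV().hasLabel("region").
  profile()
```

Running the same thing on one of the hanging traversals (ideally against a small test graph) should show whether the initial V() is a full scan.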

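Fwiw, I think your query that works "fine" is the optimal way to write this, given what you state as your desired output. I don't think your first hanging query will ever return results, because and() filters the current vertex and so requires that the vertex have a label that is both "region" and "person", which is not possible. The second hanging query doesn't seem to need the and() in the first place and filters on the "region" label twice. Note also that and() only filters and leaves the traverser on the original vertex, so the trailing hasLabel("region") there is applied to the person vertices that passed the filter, not to the regions reached inside the and().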
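If you do want to start from the region vertices, one way to express the filter is where() with a child traversal that walks from the region out to the person. The Gremlin Console sketch below is only an illustration: the TinkerGraph instance, the sample data, and the "name" property on the region are assumptions, and whether your graph can use an index for this shape is something to confirm with profile().

```groovy
// Gremlin Console sketch on an in-memory TinkerGraph (assumed for illustration).
graph = TinkerGraph.open()
g = graph.traversal()

// Hypothetical sample data: one region with a has_person edge to Tom.
g.addV("region").property("name", "north").as("r").
  addV("person").property("name", "Tom").as("p").
  addE("has_person").from("r").to("p").iterate()

// Start from regions and keep those that have an outgoing
// has_person edge to a person named Tom.
g.V().hasLabel("region").
  where(__.out("has_person").hasLabel("person").has("name", "Tom")).
  values("name")
```

The child traversal inside where() starts at each region, so the label checks apply to the right vertices. On a large graph this still begins with every region, which is why starting from the indexed "Tom" lookup is likely to be cheaper.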

Upvotes: 1
