sameetm

Reputation: 68

gremlin hasLabel query times out

I have a test graph with fewer than a million nodes and probably a slightly higher number of edges. I'm using a remote Gremlin client to connect to a JanusGraph/Gremlin Server instance backed by three Scylla nodes.

I have nodes with several different labels: url, domain, host, and brand. The graph contains mainly url, domain, and host nodes. There is only one brand node in the entire graph. The brand node looks like this:

{
    label: brand 
    properties: {
        brand: string
    }
}

I am able to do the following query in 1.5 ms. The brand property has a composite index.

g.V().hasLabel('brand').has('brand','stackoverflow');

The query below hits the 30s timeout. I expect it to return only one result, based on the data I imported into the graph. I verified this by testing with a limit.

g.V().hasLabel('brand')

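My questions:

  • Why does this time out?
  • Is JanusGraph scanning through all nodes in the graph to try to find a single node labeled 'brand'? Is there no default index on labels?
  • Why does the first query execute fine when the first steps for both are the same?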

Thank you

Upvotes: 0

Views: 526

Answers (1)

bechbd

Reputation: 6341

  • Why does this time out?
  • Is JanusGraph scanning through all nodes in the graph to try to find a single node labeled 'brand'? Is there no default index on labels?

As you have guessed, this is likely timing out because of a full graph scan: vertex labels are not indexed in JanusGraph, so hasLabel('brand') on its own has to check every vertex. There is an open issue for this: https://github.com/JanusGraph/janusgraph/issues/283

  • Why does the first query execute fine when the first steps for both are the same?

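In this case I suspect that JanusGraph's optimizer is able to rewrite the traversal plan so that has('brand','stackoverflow') is answered from the composite index on brand. The label check is then applied only to the vertices the index returns, rather than to every vertex in the graph.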
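To make the difference concrete, here is a small toy model in Python. It is only a sketch of the two access patterns described above, not JanusGraph's actual storage layer or query planner; the class ToyGraph, its methods, and the example.com URLs are all made up for illustration. It counts how many vertices each query has to touch.

```python
from collections import defaultdict


class ToyGraph:
    """Toy in-memory graph (hypothetical; not how JanusGraph is implemented)."""

    def __init__(self):
        self.vertices = []                    # every vertex, in insertion order
        self.brand_index = defaultdict(list)  # stand-in for a composite index on 'brand'
        self.examined = 0                     # vertices touched by the last query

    def add_vertex(self, label, **props):
        vertex = {"label": label, "props": props}
        self.vertices.append(vertex)
        if "brand" in props:
            self.brand_index[props["brand"]].append(vertex)
        return vertex

    def has_label(self, label):
        """Models g.V().hasLabel(label): no label index, so every vertex is checked."""
        self.examined = 0
        matches = []
        for vertex in self.vertices:
            self.examined += 1
            if vertex["label"] == label:
                matches.append(vertex)
        return matches

    def has_label_and_brand(self, label, brand):
        """Models g.V().hasLabel(label).has('brand', brand):
        index lookup first, label filter on the few candidates second."""
        candidates = self.brand_index.get(brand, [])
        self.examined = len(candidates)
        return [v for v in candidates if v["label"] == label]


def build_graph(n_urls=100_000):
    """Many url vertices plus a single brand vertex, as in the question."""
    graph = ToyGraph()
    for i in range(n_urls):
        graph.add_vertex("url", url=f"https://example.com/{i}")
    graph.add_vertex("brand", brand="stackoverflow")
    return graph


if __name__ == "__main__":
    g = build_graph()
    g.has_label("brand")
    print(g.examined)  # 100001: every vertex was checked to find one match
    g.has_label_and_brand("brand", "stackoverflow")
    print(g.examined)  # 1: only the vertex returned by the index was checked
```

Both queries return the same single vertex, but the label-only query does work proportional to the size of the graph, which is consistent with it hitting the timeout while the indexed query finishes in milliseconds.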

Upvotes: 1
