Alan42

Reputation: 339

Updating the properties of many edges in JanusGraph (Gremlin) is very slow

I have a JanusGraph database containing about 600 million edges.

Each edge has the properties "hash" and "timestamp".

I want to update the "timestamp" property on each edge, looking the edge up by its "hash" property. However, the update is very slow.

Here is how I built the index on the "hash" property:

hash = mgmt.makePropertyKey('hash').dataType(String.class).cardinality(SINGLE).make()

mgmt.buildIndex('HashComposite', Edge.class).addKey(hash).buildCompositeIndex()

And here is how I update the "timestamp" property:

g.E().has('hash', 'c2719586fb6a26d492bf65a0263a1c52f5ff6ef3').property("timestamp", timestamp).next()

In fact, I found that even if I don't update the "timestamp" property and only look up the edge, it is still very slow:

g.E().has('hash', 'c2719586fb6a26d492bf65a0263a1c52f5ff6ef3').next()

Why is looking up edges so slow when an index on "hash" has been built?

Upvotes: 0

Views: 398

Answers (1)

Kfir Dadosh

Reputation: 1419

You can profile the query to verify that the index is actually being used:

g.E().has('hash', hash).profile()

Did you create the index before or after loading the graph data? If after, the index does not cover the existing edges yet, so you need to reindex the data first:

mgmt.updateIndex(mgmt.getGraphIndex("HashComposite"), SchemaAction.REINDEX).get()
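For context, here is a rough Gremlin Console sketch of how the status check and reindex might fit together. It assumes `graph` is your open JanusGraph instance and that the index and key are named 'HashComposite' and 'hash' as in the question; whether you need step 2 depends on what step 1 prints, so treat this as an outline rather than a recipe.

```groovy
// Sketch only: assumes `graph` is an open JanusGraph instance (assumption,
// not shown in the question) and the index name 'HashComposite' from above.
import org.janusgraph.core.schema.SchemaAction
import org.janusgraph.core.schema.SchemaStatus
import org.janusgraph.graphdb.database.management.ManagementSystem

// 1. Check the index status for the 'hash' key.
//    Queries only use the index once it is ENABLED.
mgmt = graph.openManagement()
index = mgmt.getGraphIndex('HashComposite')
println index.getIndexStatus(mgmt.getPropertyKey('hash'))
mgmt.rollback()   // read-only check, nothing to commit

// 2. Only if step 1 printed INSTALLED: wait until every instance
//    has registered the index before reindexing.
ManagementSystem.awaitGraphIndexStatus(graph, 'HashComposite').status(SchemaStatus.REGISTERED).call()

// 3. Reindex the edges that were loaded before the index existed.
mgmt = graph.openManagement()
mgmt.updateIndex(mgmt.getGraphIndex('HashComposite'), SchemaAction.REINDEX).get()
mgmt.commit()

// 4. Wait for the index to become ENABLED, then rerun the profile() query.
ManagementSystem.awaitGraphIndexStatus(graph, 'HashComposite').status(SchemaStatus.ENABLED).call()
```

With around 600 million edges, the reindex job itself can take a long time, so expect step 3 to block for a while.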

Upvotes: 2
