Reputation: 1503
I'm using multi-threaded transactions as described in the JanusGraph docs. Each of my threads contributes to building a directory tree. Before inserting a vertex for a directory, each thread checks, within the same query, whether such a vertex already exists. A vertex is only inserted via .orElseGet if no existing one is found.
    Vertex vertex = graph.traversal().V()
        .hasLabel(VertexLabels.DIRECTORY)
        .has(PropertyKeys.PATH, directory.path())
        .tryNext()
        .orElseGet(() -> {
            return graph.addVertex(
                T.label, VertexLabels.DIRECTORY,
                PropertyKeys.PATH, directory.path());
        });
In theory, this should prevent duplicates, assuming that all threads operate within the same transactional scope. However, I still encounter duplicates, and the docs don't seem to address this. Can you confirm whether multi-threaded transactions operate within the same scope?
Upvotes: 0
Views: 131
Reputation: 46226
Multi-threaded transactions operate in the same scope, but I suppose it remains possible for the threads to race if you haven't configured a unique constraint on PropertyKeys.PATH: two threads can both find no existing vertex and then both insert one. Adding the constraint means that locking is enabled, which might slow down your ingestion rate but will ensure uniqueness.
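For illustration, a unique constraint in JanusGraph is typically declared as a unique composite index through the management API. The following is a minimal sketch, not the asker's actual schema: the key name "path", the index name "byPathUnique", and the class and method names are assumptions, and it assumes the property key has not been created yet (makePropertyKey fails if it already exists).

```java
import org.apache.tinkerpop.gremlin.structure.Vertex;
import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.PropertyKey;
import org.janusgraph.core.schema.ConsistencyModifier;
import org.janusgraph.core.schema.JanusGraphIndex;
import org.janusgraph.core.schema.JanusGraphManagement;

public class UniquePathIndex {

    // Declares a String property key "path" (assumed name) and a unique
    // composite index over it, so that at most one vertex may carry a
    // given path value.
    public static void defineUniquePath(JanusGraph graph) {
        JanusGraphManagement mgmt = graph.openManagement();

        PropertyKey path = mgmt.makePropertyKey("path")
                .dataType(String.class)
                .make();

        JanusGraphIndex byPath = mgmt.buildIndex("byPathUnique", Vertex.class)
                .addKey(path)
                .unique()
                .buildCompositeIndex();

        // On eventually consistent storage backends, uniqueness is only
        // enforced when locking is enabled explicitly on the index.
        mgmt.setConsistency(byPath, ConsistencyModifier.LOCK);

        mgmt.commit();
    }
}
```

With such an index in place, a thread that loses the race would be expected to fail at commit time rather than silently create a duplicate, so the ingestion code should be prepared to catch the failure and retry the lookup.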
As a side note, please consider avoiding use of the Graph API (graph.addVertex()) and sticking to pure Gremlin. The "get or create" pattern is described here.
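A common way to write "get or create" in pure Gremlin is the fold()/coalesce() idiom. The sketch below assumes the label "directory" and the key "path" stand in for VertexLabels.DIRECTORY and PropertyKeys.PATH; the class and method names are hypothetical.

```java
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.__;
import org.apache.tinkerpop.gremlin.structure.Vertex;

public class DirectoryUpsert {

    // Returns the existing "directory" vertex with the given path, or
    // creates one if the lookup finds nothing.
    public static Vertex getOrCreateDirectory(GraphTraversalSource g, String path) {
        return g.V()
                .hasLabel("directory")
                .has("path", path)
                // fold() collapses the matches into a single list, so the
                // traversal continues even when nothing was found.
                .fold()
                .coalesce(
                        // Non-empty list: emit the existing vertex.
                        __.unfold(),
                        // Empty list: add the vertex instead.
                        __.addV("directory").property("path", path))
                .next();
    }
}
```

Note that this keeps the lookup and the insert in one traversal but is still a check followed by an insert, so on its own it should not be expected to prevent concurrent threads from racing. The unique constraint remains what guarantees uniqueness.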
Upvotes: 3