pro-grammar

Reputation: 21

TinkerGraph g.V(ids).drop().iterate() is confusingly slow

I have run into an issue using plain TinkerGraph to drop a moderate number of vertices. In total, about 1250 vertices and 2500 edges will be dropped.

When running the following:

g.V(ids).drop().iterate()

It takes around 20-30 seconds. That seems far too slow, and as far as I can tell the time is spent entirely in removing the vertices, not in anything else.

I'm hoping there is some key piece that I am missing or an area I have yet to explore that will help me out here.

The environment is not memory or CPU constrained in any way. I've profiled the code and see the majority of the time spent is in the TinkerVertex.remove method. This is doubly strange because the creation of these nodes takes less than a second.

I've been able to optimize this a bit by batching the ids and dropping each batch on a separate thread, as in this question: Improve performance removing TinkerGraph vertices

However, 10-15 seconds is still too long as I'm hoping to have this be a synchronous operation.

I've considered following something like this but that feels like overkill for dropping less than 5k elements...

To note, the size of the graph is around 110k vertices and 150k edges.

I've tried to profile the gremlin query but it seems that you can't profile through the JVM using:

g.V(ids).drop().iterate().profile()

I've tried various ways of writing the query for profiling but was unable to get it to work.

I'm hoping there is just something I'm missing that will help get this resolved.

Upvotes: 2

Views: 330

Answers (1)

Kelvin Lawrence

Reputation: 14371

As mentioned in the comments, it definitely seems unusual that this operation is taking so long, unless the machine being used is very busy performing other tasks. Using my laptop (16GB RAM, modest CPU and other specs) I can drop the entire air-routes graph (3,747 nodes and 57,660 edges) in milliseconds from the Gremlin Console.

gremlin> Gremlin.version
==>3.6.0

gremlin> g
==>graphtraversalsource[tinkergraph[vertices:3747 edges:57660], standard]

gremlin> g.V().drop().profile()
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
TinkerGraphStep(vertex,[])                                          3747        3747           6.226     7.52
DropStep                                                                                      76.587    92.48
                                            >TOTAL                     -           -          82.813        -

gremlin> g
==>graphtraversalsource[tinkergraph[vertices:0 edges:0], standard]   

I also tried dropping a list of 1000 nodes as follows, and it still completed in milliseconds.

gremlin> g
tinkergraph[vertices:3747 edges:57660]


gremlin> a=[] ; for (x in (1..1000)) {a << x}
==>null

gremlin> a.size()
==>1000

gremlin> g.V(a).drop().profile()
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
TinkerGraphStep(vertex,[1, 2, 3, 4, 5, 6, 7, 8,...                  1000        1000           2.677    13.87
DropStep                                                                                      16.626    86.13
                                            >TOTAL                     -           -          19.304        -

gremlin> g
==>graphtraversalsource[tinkergraph[vertices:2747 edges:9331], standard]   

Perhaps see if you can get a profile from your Java code using a query without iterate(). It is not needed, and profile() has to be the last step of the traversal; from Java you read the result with next(), for example g.V(ids).drop().profile().next(). Also check for any unusual GC activity. I would also check whether you see the same issue using the Gremlin Console. Something is definitely odd here. If none of these investigations bear fruit, perhaps update the question to show the exact Java code you are using.
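For the GC check, one rough option is to compare the JVM's own collector counters before and after the drop. The following is only a sketch using the standard java.lang.management API; the GcProbe class, its measure method and the stand-in workload are hypothetical names made up for illustration, and the Runnable is where your own drop call would go.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper (not part of TinkerPop): times a task and reports how
// much garbage collection the JVM performed while the task was running.
class GcProbe {

    // Wall-clock time plus GC work observed during one measured task.
    static final class Result {
        final long elapsedMillis;
        final long gcCount;
        final long gcMillis;

        Result(long elapsedMillis, long gcCount, long gcMillis) {
            this.elapsedMillis = elapsedMillis;
            this.gcCount = gcCount;
            this.gcMillis = gcMillis;
        }

        @Override
        public String toString() {
            return "elapsed=" + elapsedMillis + "ms, gcCount=" + gcCount
                    + ", gcTime=" + gcMillis + "ms";
        }
    }

    // Sums collection count and accumulated collection time over all collectors.
    private static long[] gcTotals() {
        long count = 0;
        long millis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long c = gc.getCollectionCount(); // -1 if undefined for this collector
            long t = gc.getCollectionTime();  // -1 if undefined for this collector
            if (c > 0) {
                count += c;
            }
            if (t > 0) {
                millis += t;
            }
        }
        return new long[] {count, millis};
    }

    // Runs the task once and returns elapsed time and the GC deltas around it.
    static Result measure(Runnable task) {
        long[] before = gcTotals();
        long start = System.nanoTime();
        task.run();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        long[] after = gcTotals();
        return new Result(elapsedMillis, after[0] - before[0], after[1] - before[1]);
    }

    public static void main(String[] args) {
        // Stand-in workload that only allocates garbage. In the real test,
        // replace the body with the drop, e.g. g.V(ids).drop().iterate()
        Result r = measure(() -> {
            for (int i = 0; i < 100_000; i++) {
                byte[] junk = new byte[1024];
                junk[0] = (byte) i;
            }
        });
        System.out.println(r);
    }
}
```

If gcTime accounts for a large share of elapsed, heap pressure is the likely culprit rather than the drop itself. If gcTime stays near zero while the drop still takes many seconds, the time really is going into the removal, and the exact Java code becomes the next thing to look at.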

Upvotes: 1
