Reputation: 97
I'm currently working with JanusGraph in a Hadoop environment. I've already loaded a large number of vertices into the graph (about half a million) and have an index on the primary key running. Iterating over every vertex takes about 3 minutes. I currently have 0 edges in my graph.
To load my graph edges, I'm reading a CSV file that contains the data. Because I sometimes face timeouts (due to the environment), I have been first querying the count of vertices and then skipping to the correct row in the CSV to restart the loading.
However, asking for the count of edges, so that I can do the same with my edge CSV file, takes about 4 minutes and produces a timeout on my TinkerPop server.
Is there a way to get the total count of edges in the graph without iterating every single vertex?
Adding the edges themselves works fine, as the composite index for the vertices is quite fast.
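For reference, the restart logic described above can be sketched roughly as follows. This is a minimal Python sketch and not the actual loader: the function name `rows_to_resume`, the `has_header` flag, and the assumption that one CSV row corresponds to one loaded element are all hypothetical.

```python
import csv
from itertools import islice


def rows_to_resume(csv_path, already_loaded, has_header=True):
    """Yield the CSV rows that still need to be loaded.

    already_loaded is the number of data rows already written to the graph,
    e.g. the element count reported by the graph (hypothetical helper).
    """
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle)
        if has_header:
            next(reader, None)  # skip the header line, if the file has one
        # islice lazily consumes and discards the first already_loaded rows
        yield from islice(reader, already_loaded, None)
```

The sketch only works if the count is cheap to obtain, which is exactly what fails for the edges.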
Upvotes: 1
Views: 353
Reputation: 46206
Given the way that edges are stored in JanusGraph, g.E() will basically iterate all vertices to get the edges, so there's not much you can do to get a count. It is worth noting that iterating edges is a graph-specific issue, so other graphs may behave differently. For example, TinkerGraph handles counting with a strategy that bypasses iteration completely.
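To illustrate why the count is tied to vertex iteration, here is a toy Python model, assuming an adjacency-list layout in which each vertex holds its own edges. It is not JanusGraph code; the `adjacency` dictionary and `count_edges` function are made up for the illustration.

```python
# Toy model only: each vertex keeps its own list of outgoing edges.
adjacency = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": [],
}


def count_edges(adjacency_lists):
    """Return (edge_count, vertices_visited) for an adjacency-list graph."""
    vertices_visited = 0
    edge_count = 0
    # There is no separate edge collection, so every vertex must be visited,
    # including vertices that have no edges at all.
    for out_edges in adjacency_lists.values():
        vertices_visited += 1
        edge_count += len(out_edges)
    return edge_count, vertices_visited


edge_count, vertices_visited = count_edges(adjacency)
print(edge_count, vertices_visited)
```

In this model the cost of counting grows with the number of vertices even when there are few or no edges, which matches the behaviour described in the question.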
Upvotes: 2