Reputation: 105
I have a question about shortestPath algorithm in neo4j.
If I have a graph with 10^6 nodes and each node has 1000 relationships, searching for the shortest path up to 4 levels, must search for 1000*1000*1000*1000=10^12 nodes that is higher than total nodes. The reason is that some nodes are repeated during search. My question is that in neo4j shortestPath algorithm, this search takes time of touching 10^6 nodes or 10^12 nodes. In other words, does it mark up nodes that are already searched to not search them again?
Thanks
Upvotes: 1
Views: 276
Reputation: 30397
I don't believe that kind of pruning is used. In Cypher, the default uniqueness for traversals is RELATIONSHIP_PATH: within each path, a relationship must be unique, they can't be reused.
You might try using either the shortestPath proc in the Graph Algorithms project or one of APOC Procedures' path expander procs instead.
With APOC path expanders, you can either set the uniqueness yourself to NODE_GLOBAL (which prevents processing of the same nodes multiple times during all expansions), or use one of the procs that already does this under the hood (subgraphNodes()
, subgraphAll()
, or spanningTree()
).
The gotchas (at the moment) with APOC are that you can't currently supply the end nodes of the expansion (you'll have to expand out to nodes with certain defined labels and filter your results after with a WHERE clause), and expansions only go in one direction (from start node out) instead of bi-directional (such as from cypher's shortestPath()), so you won't realize any efficiency improvements that can happen from expanding from the other direction.
I currently have a PR on APOC to supply known end nodes of the expansion, so that should make it into the next APOC release (within the next week or so).
Upvotes: 1