mabr
mabr

Reputation: 380

ArangoDB: Find all shortest paths

i want to get all shortest paths between 2 vertex.

Example: Give me all shortest path between node A and B should only return the 2 blue paths

example graph

this is what i have got so far:

LET source = (FOR x IN Entity FILTER x.objectID == "organization_1"
    return x)[0]
LET destination = (FOR x IN Entity FILTER x.objectID == "organization_129"
    return x)[0]
FOR node, edge, path IN 1..2 ANY source._id GRAPH "m" 
    FILTER LAST(path.vertices)._id == destination._id
    LIMIT 100
RETURN path

Problems: 1. it is very slow (took 18 seconds on a graph with like 70 mio nodes) 2. it finds every path, but i want only all shortest path

UPDATE i tried the 2-step query solution from the comments. the problem is that the second query is also very slow

Query string:
 FOR source IN Entity FILTER source.objectID == "organization_1"
 LIMIT 1
 FOR node, edge, path 
   IN 1..@depth ANY source._id 
   GRAPH "m" 
   OPTIONS {uniqueVertices: "path"}
   FILTER node.objectID == "organization_129"
   RETURN path

Execution plan:
 Id   NodeType          Est.   Comment
  1   SingletonNode        1   * ROOT
 11   IndexNode            1     - FOR source IN Entity   /* hash index scan */
  5   LimitNode            1       - LIMIT 0, 1
  6   CalculationNode      1       - LET #6 = source.`_id`   /* attribute expression */   /* collections used: source : Entity */
  7   TraversalNode      346       - FOR node  /* vertex */, path  /* paths */ IN 1..2  /* min..maxPathDepth */ ANY #6 /* startnode */  GRAPH 'm'
  8   CalculationNode    346       - LET #10 = (node.`objectID` == "organization_129")   /* simple expression */
  9   FilterNode         346       - FILTER #10
 10   ReturnNode         346       - RETURN path

Indexes used:
 By   Type   Collection      Unique   Sparse   Selectivity   Fields                       Ranges
 11   hash   Entity          false    false       100.00 %   [ `objectID` ]   (source.`objectID` == "organization_1")
  7   edge   ACTIVITYPARTY   false    false       100.00 %   [ `_from`, `_to` ]           base INBOUND
  7   edge   ACTIVITYPARTY   false    false       100.00 %   [ `_from`, `_to` ]           base OUTBOUND
  7   edge   ACTIVITY_LINK   false    false       100.00 %   [ `_from`, `_to` ]           base INBOUND
  7   edge   ACTIVITY_LINK   false    false       100.00 %   [ `_from`, `_to` ]           base OUTBOUND
  7   edge   ENTITY_LINK     false    false        70.38 %   [ `_from`, `_to` ]           base INBOUND
  7   edge   ENTITY_LINK     false    false        70.38 %   [ `_from`, `_to` ]           base OUTBOUND
  7   edge   RELATION        false    false        20.49 %   [ `_from`, `_to` ]           base INBOUND
  7   edge   RELATION        false    false        20.49 %   [ `_from`, `_to` ]           base OUTBOUND
  7   edge   SOFT_LINK       false    false       100.00 %   [ `_from`, `_to` ]           base INBOUND
  7   edge   SOFT_LINK       false    false       100.00 %   [ `_from`, `_to` ]           base OUTBOUND

Traversals on graphs:
 Id   Depth   Vertex collections                      Edge collections                                                 Options                                   Filter conditions
  7   1..2    Activity, Entity, SOFT_LINK, Property   ACTIVITYPARTY, ENTITY_LINK, SOFT_LINK, RELATION, ACTIVITY_LINK   uniqueVertices: path, uniqueEdges: path   

Optimization rules applied:
 Id   RuleName
  1   move-calculations-up
  2   move-filters-up
  3   move-calculations-up-2
  4   move-filters-up-2
  5   use-indexes
  6   remove-filter-covered-by-index
  7   remove-unnecessary-calculations-2
  8   optimize-traversals
  9   move-calculations-down

Upvotes: 2

Views: 1268

Answers (1)

mpv89
mpv89

Reputation: 1891

First of all you need a hash index on field objectID in collection Entity to avoid the full collection scans, which heavily slows down your performance.

To get all shortest path I would first search for one shortest path with the AQL SHORTEST_PATH and return the number of visited vertices. There is also no need of subqueries (like in your query).

FOR source IN Entity FILTER source.objectID == "organization_1"
LIMIT 1
FOR destination IN Entity FILTER destination.objectID == "organization_129"
LIMIT 1
RETURN sum(
  FOR v, e
  IN ANY
  SHORTEST_PATH source._id TO destination._id
  GRAPH "m"
  RETURN 1)-1

After that I would execute another query with the result from the first query as bind parameter @depth, which is used to limit the depth of the traversal.

FOR source IN Entity FILTER source.objectID == "organization_1"
LIMIT 1
FOR node, edge, path 
  IN 1..@depth ANY source._id 
  GRAPH "m" 
  OPTIONS {uniqueVertices: "path"}
  FILTER node.objectID == "organization_129"
  RETURN path

Note: To filter the last vertex in the path you don't have to use LAST(path.vertices), you can simply use node because it is already the last vertex (the same applies for edge).

Upvotes: 1

Related Questions