piter12
piter12

Reputation: 19

How to use MongoDB graph lookup?

These are example documents from my collection, let's say it's called bus_routes:

{
   "_id":{
      "$oid":"56e9b39c732b6122f878576ba"
   },
   "src_busStop":"A",
   "dst_busStop":"B",
   "bus":"7318 FAF"
}
{
   "_id":{
      "$oid":"56e9b39c732b6122f878576bb"
   },
   "src_busStop":"B",
   "dst_busStop":"C",
   "bus":"7319 FAF"
}
{
   "_id":{
      "$oid":"56e9b39c732b6122f878576bc"
   },
   "src_busStop":"C",
   "dst_busStop":"D",
   "bus":"7320 FAF"
}

I want to select all connections (all buses) between cities A and D, where the maximum number of transfers is 3. My guess is I should use $graphLookup, but I have no idea how to implement it. Thanks in advance.

Upvotes: 1

Views: 3374

Answers (1)

Joe
Joe

Reputation: 28326

The graphLookup operator works best when there is a single path through the graph.

In the case of bus routes where there are multiple paths, each document along the path will be returned in a flat array. You will then need to further process the list to actually construct the sequence of routes.

These stages should get you all routes that start from "A".

maxDepth:2 will return up to 3 transfers out, restrictSearchWithMatch:{dst:{$ne:"A"}} will ignore any routes that return to "A".

     {$match:{src_busStop:"A"}},
     {$graphLookup:{
        from:"bus_routes",
        startWith:"$dst_busStop",
        connectFromField:"dst_busStop",
        connectToField:"src_busStop",
        as:"routes",
        maxDepth:2,
        depthField:"transfers",
        restrictSearchWithMatch:{dst:{$ne:"A"}}
     }}

Note that there isn't a simple way to restrict this to paths ending at "D". The "routes" array that is populated by graphLookup will contain all "bus_routes" documents that can be reached within 3 transfers from "A".

Edit

After some further thought, this can be done with 3 separate non-recursive lookups may be the way to get the information and keep the routes separate.

Here is an example pipeline to find the routes from "A" to "D". This feels a bit brute force to me, but I don't have a more elegant solution yet. The steps are:

  1. Select all routes originating from the start point
  2. Save the original document in a "route" field for use later
  3. Lookup all routes originating from the next reachable station, store in "firstxfer" field
  4. Unwind the firstxfers so we can consider each route separately
  5. Discard all routes that go back to where they came from
  6. Repeat 3-5 to generate secondxfer and thirdxfer
  7. Combine the original document and each transfer document in reverse order in the "route" field
  8. Reduce the array, eliminating any routes after the destination is reached, and reversing the order
  9. Group with addToSet to eliminate duplicate routes
  10. Unwind the routes again so we can sort
  11. Add the number of routes required in the "buses" field
  12. Sort so the shortest routes come first
db.bus_routes.aggregate([
    {$match:{src_busStop:"A"}},
    {$addFields:{route:["$$ROOT"]}},
    {$lookup:{from:"bus_routes",localField:"dst_busStop",foreignField:"src_busStop",as:"firstxfer"}},
    {$unwind:"$firstxfer"},
    {$match:{"firstxfer.dst_busStop":{$ne:"A"}}},
    {$addFields:{firstxfer:{$cond:[{$eq:["D","$firstxfer.src_busStop"]},[],"$firstxfer"]}}},
    {$lookup:{from:"bus_routes",localField:"firstxfer.dst_busStop",foreignField:"src_busStop",as:"secondxfer"}},
    {$unwind:{path: "$secondxfer", preserveNullAndEmptyArrays: true}},
    {$match:{"secondxfer.dst_busStop":{$ne:"A"}}},
    {$addFields:{secondxfer:{$cond:[{$eq:["D","$secondxfer.src_busStop"]},[],"$secondxfer"]}}},
    {$lookup:{from:"bus_routes",localField:"secondxfer.dst_busStop",foreignField:"src_busStop",as:"thirdxfer"}},
    {$unwind: {path: "$thirdxfer", preserveNullAndEmptyArrays: true}},
    {$addFields:{thirdxfer:{$cond:[{$eq:["D","$thirdxfer.src_busStop"]},[],"$thirdxfer"]}}},
    {$project:{
        route:["$thirdxfer","$secondxfer","$firstxfer","$route"]
    }},
    {$match:{"route.dst_busStop":"D"}},
    {$project:{
        route:{
            $reduce:{
                input:"$route",
                initialValue:[],
                in:{$cond:[
                           {$or:[{$eq:["D","$$this.dst_busStop"]},{$gt:[{$size:"$$value"},0]}]},
                           {$concatArrays:[["$$this"],"$$value"]},
                           "$$value"
                           ]}
            }
        }
    }},
    {$group:{
        _id:null,
        route:{$addToSet:"$route"}
    }},
    {$unwind:"$route"},
    {$addFields:{ buses:{$size:"$route"}}},
    {$sort:{buses:1}}
])

I set up a test data set in the Playground to demonstrate this.

Upvotes: 4

Related Questions