Prabhjot
Prabhjot

Reputation: 599

MongoDB $graphLookup for finding hierarchy of related documents in a collection

I am trying to find hierarchy of documents (Aggregation) from a collection in MongoDB using $graphLookup with keys partOf and contains which are of Object datatype that store array of categories and the _id field inside.

Input collection sample (relationships_collection):

// Doc 1:

{
  "_id": "gotham_hotel",
  "category": "hotel",
  "partOf": {
    "street": [
      "kings_street_street"
    ]
  }
}
// Doc 2:

{
  "_id": "kings_street_street",
  "category": "street",
  "partOf": {
    "pincode": [
      "m24ah_pincode"
    ]
  }
}
// Doc 3:

{
  "_id": "m24ah_pincode",
  "category": "pincode",
  "partOf": {
    "city": [
      "manchester_city"
    ]
  }
}
// Doc 4:

{
  "_id": "manchester_city",
  "category": "city",
  "partOf": {
    "country": [
      "england_country"
    ]
  }
}
// Doc 5:

{
  "_id": "england_country",
  "category": "country",
  "partOf": {
    "continent": [
      "europe_continent"
    ]
  }
}
// Doc 6:

{
  "_id": "europe_continent",
  "category": "continent",
  "partOf": {
    "region": [
      "north_region"
    ]
  }
}
// Doc 7 (Not partOf any other document _id. Dead End):

{
  "_id": "north_region",
  "category": "region",
  "contains": {
    "continent": [
      "europe_continent",
      "antarctica_continent"
    ]
  }
}

Expected Output of Aggregation:

// Fetch complete hierarchy of gotham_hotel or any other document:

{
  "_id": "gotham_hotel",
  "category": "hotel",
  "partOf": {
    "street": [
      {
        "_id": "kings_street_street",
        "category": "street",
        "partOf": {
          "pincode": [
            {
              "_id": "m24ah_pincode",
              "category": "pincode",
              "partOf": {
                "city":  [
                  {
                    "_id": "manchester_city",
                    "category": "city",
                    "partOf": {
                      "country": [
                        {
                          "_id": "england_country",
                          "category": "country",
                          "partOf": {
                            "continent": [
                              {
                                "_id": "europe_continent",
                                "category": "continent",
                                "partOf": {
                                  "region": [
                                    {
                                      "_id": "north_region",
                                      "category": "region",
                                      "contains": {
                                        "continent": [
                                          "europe_continent",
                                          "antarctica_continent"
                                        ]
                                      }
                                    }
                                  ]
                                }
                              }
                            ]
                          }
                        }
                      ]
                    }
                  }
                ]
              }
            }
          ]
        }
      }
    ]
  }
}

I tried using the following aggregation query, but not able to get the desired output.

[
  {
    $match: {
      _id: "gotham_hotel"
    }
  },
  {
    $graphLookup: {
      from: "relationships_collection",
      startWith: "$partOf.street",
      connectFromField: "partOf.street",
      connectToField: "_id",
      depthField: "depth",
      as: "partOfHierarchy"
    }
  },
  {
    "$set": {
      "partOf": {
        "street": {
          "$setUnion": [
            {
              "$ifNull": [
                "$partOf.steet",
                []
              ]
            },
            {
              "$reduce": {
                "input": "$partOfHierarchy.partOf.street",
                "initialValue": [],
                "in": {
                  "$setUnion": [
                    "$$value",
                    "$$this"
                  ]
                }
              }
            }
          ]
        },
        "pincode": {
          "$setUnion": [
            {
              $ifNull: [
                "$partOf.pincode",
                []
              ]
            },
            {
              "$reduce": {
                "input": "$partOfHierarchy.partOf.pincode",
                "initialValue": [],
                "in": {
                  "$setUnion": [
                    "$$value",
                    "$$this"
                  ]
                }
              }
            }
          ]
        }
      }
    }
  }
]

How can I achieve the desired output using aggregation query?

Upvotes: 0

Views: 75

Answers (2)

ray
ray

Reputation: 15276

Allow me to reiterate: Dynamic field names, dynamic nesting levels, inconsistent parent-child modelling... are all anti-patterns. Please seriously consider the invaluable advice from @joe and @cmgchess to refactor your schema.

Nevertheless, for your current scenario, if we only consider the "tree" starting from gotham_hotel, we may get close to the desired structure by doing 3 things:

  1. Ignore the north_region node. It doesn't follow the rule of being a child node that have a partOf field. Instead, it becomes a parent node with the contains field. Fortunately it is already the leave node and we do not need it to recursively find further node anymore, at least in the given example
  2. create a view to "flatten" your current documents to have static field name for $graphLookup, which is suggested by @joe
  3. With the created view in step #2, apply the concept in this answer to reform the recursive structure.

Regarding step #2, we use below command to create a view with $objectToArray to standardize the field name for $graphLookup

db.createView(
  "relationshipView",
  "relationships",
  [
    {
      $set: {
        partOf: {$first: { $objectToArray: '$partOf' }}
      }
    }
  ]
)

So a sample document from relationshipView will look like this:

{
      "_id": "gotham_hotel",
      "category": "hotel",
      "partOf": {
        "k": "street",
        "v": [
          "kings_street_street"
        ]
      }
    }

We can use partOf.v to look up _id

Next in step #3, we shamelessly apply the linked answer's concept to reconstruct the tree. This requires a minor modification, since the original question is a parent-to-child lookup, while here is a child-to-parent lookup. I don't think I can give a better explanation than the original answer. Please read the explanation there and vote it up if it helped you to learn something.

{
    $set: {
      partOf: {
        $reduce: {
          input: "$partOf",
          initialValue: {
            level: -1,
            presentPartOf: [],
            prevPartOf: []
          },
          in: {
            $let: {
              vars: {
                prev: {
                  $cond: [
                    {
                      $eq: [
                        "$$value.level",
                        "$$this.level"
                      ]
                    },
                    "$$value.prevPartOf",
                    "$$value.presentPartOf"
                  ]
                },
                current: {
                  $cond: [
                    {
                      $eq: [
                        "$$value.level",
                        "$$this.level"
                      ]
                    },
                    "$$value.presentPartOf",
                    []
                  ]
                }
              },
              in: {
                level: "$$this.level",
                prevPartOf: "$$prev",
                presentPartOf: {
                  $concatArrays: [
                    "$$current",
                    [
                      {
                        $mergeObjects: [
                          "$$this",
                          {
                            partOf: {
                              $filter: {
                                input: "$$prev",
                                as: "e",
                                cond: {
                                  $in: [
                                    "$$e._id",
                                    "$$this.partOf.v"
                                  ]
                                }
                              }
                            }
                          }
                        ]
                      }
                    ]
                  ]
                }
              }
            }
          }
        }
      }
    }
  },
  {
    "$set": {
      "partOf": "$partOf.presentPartOf"
    }
  }

Full code in Mongo Playground


Again, you are reminded about the caveats of the above anti-patterns and the difficulty of processing such a dynamically nesting structure. Refactor the schema if possible.

Upvotes: 2

Joe
Joe

Reputation: 28366

You won't get precisely that output from MongoDB.

However, in the case of a simple linked list like this, you can get an array that would be trivially converted to that hierarchy on the client side.

First, a schema change to simplify the lookup. Since each id is already tagged with its type, there may not be a need to specify the type explicitly, but if there is, adding that as a sibling field makes this easier. Something like:

      "partOf": [
        {
          "type": "street",
          "id": "kings_street_street"
        }

This simplifies the graphLookup stage because the field name is then consistent for all documents:

  {"$graphLookup": {
      "from": "relationships_collection",
      "startWith": "$partOf.id",
      "connectFromField": "partOf.id",
      "connectToField": "_id",
      "as": "LookedUp",
      "depthField": "depth"
  }}

This returns the discovered documents in an unsorted array, but with depthField, this are easily sorted:

{"$addFields": {
      "LookedUp": {
        "$sortArray": {
          "input": "$LookedUp",
          "sortBy": {"depth": 1}
        }
      }
}}

The resulting array contains the hierarchy, on client side you can iterate the array to nest each element in the one before it.

Example: Playground

Upvotes: 2

Related Questions