user7951064
user7951064

Reputation:

How to Join Arrays in the Same Document?

I would like to combine the data in one collection using the IDs of the two arrays.

An example is shown below.

{
    "_id": ObjectId ("5976fd2eb0adec0a32fa9831"),
       "People": [
          {
            "_id": 1,      <--- ID
            "Name": "jane"
          },
          {
            "_id": 2,      <--- ID
            "Name": "Mark"
          }
       ],
       "Contents": [
          {
            "userID":  2,   <--- People ID
            "Text": "111"
          },
          {
            "userID":  1,   <--- People ID
            "Text": "Hi"
          }
       ]
}

I want to make the above document as below.

{
    "_id": ObjectId ("5976fd2eb0adec0a32fa9831"),
    "People": [
       {
          "_id": 1,
          "Name" : "Jane"
       },
       {
          "_id": 2,
          "Name": "Mark"
       }
    ],
    "Contents": [
       {
          "userID": 2,
          "Name": "Mark",    <-- Adding
          "Text": "111",

      },
       {
          "userID": 1,
          "Name": "Jane",    <-- Adding
          "Text": "Hi",

      }
    ]
}

I have tried various things like $lookup or $unwind of .aggregate() but I cannot get the result.

Upvotes: 5

Views: 2067

Answers (1)

Neil Lunn
Neil Lunn

Reputation: 151092

You want $map and $indexOfArray ideally:

db.collection.aggregate([
  { "$addFields": {
    "Contents": {
      "$map": {
        "input": "$Contents",
        "as": "c",
        "in": {
          "userID": "$$c.userID",
          "Name": {
            "$arrayElemAt": [
              "$People.Name",
              { "$indexOfArray": [ "$People._id", "$$c.userID" ] }
            ]
          },
          "Text": "$$c.Text"
        }
      }
    }
  }}
])

Which basically grabs the value from the other array via $arrayElemAt for the matching "index" returned by $indexOfArray.

If your MongoDB needs to fall back a version without that operator, then you could use $filter instead:

db.collection.aggregate([
  { "$addFields": {
    "Contents": {
      "$map": {
        "input": "$Contents",
        "as": "c",
        "in": {
          "userID": "$$c.userID",
          "Name": {
            "$arrayElemAt": [
              { "$map": {
                "input": { 
                  "$filter": {
                    "input": "$People",
                    "as": "p",
                    "cond": { "$eq": [ "$$p._id", "$$c.userID" ] }
                  }
                },
                "as": "p",
                "in": "$$p.Name"
              }},
              0
            ]
          },
          "Text": "$$c.Text"
        }
      }
    }
  }}
])

Where basically you $filter the results down of the other array in comparison and simply return the first matching element by the 0 index with $arrayElemAt.

In either case, there is no need to "self-join" using $lookup, and that's just really unnecessary overhead best avoided.

From the document in the question you get the following:

/* 1 */
{
    "_id" : ObjectId("5976fd2eb0adec0a32fa9831"),
    "People" : [ 
        {
            "_id" : 1.0,
            "Name" : "jane"
        }, 
        {
            "_id" : 2.0,
            "Name" : "Mark"
        }
    ],
    "Contents" : [ 
        {
            "userID" : 2.0,
            "Name" : "Mark",
            "Text" : "111"
        }, 
        {
            "userID" : 1.0,
            "Name" : "jane",
            "Text" : "Hi"
        }
    ]
}

Generally speaking though, there is no such reason for any aggregation operators at all, as this sort of operation is generally best left to post-processing in the cursor. In fact since you are actually "adding" data to the document to return, it's better to do modification after the document is sent over the network.

As a common idiom of the above shown as JavaScript for the shell:

db.collection.find().map( d => 
  Object.assign(
    d,
    {
      "Contents": d.Contents.map( c => 
        Object.assign(c, 
          { "Name": d.People.map(p => p.Name)[d.People.map(p => p._id).indexOf(c.userID)] }
        )
      )
    }
  )
)

Produces the exact same result, and is generally a bit easier on the eyes to read and interpret

Upvotes: 7

Related Questions