sachin.pandey
sachin.pandey

Reputation: 39

MongoDB merge two collections with unmatched documents

I am trying to compare and find different documents from two collections below are the samples, Mongodb version:4.0, ORM:mongoose

**col1: Has one new document**

    { "id" : 200001, "mobileNo" : #######001 }
    { "id" : 200002, "mobileNo" : #######002 } //mobileNo may not be unique.
    { "id" : 200003, "mobileNo" : #######002 }
    { "id" : 200004, "mobileNo" : #######004 }

**col2:**

    { "id" : 200001, "mobileNo" : #######001 }
    { "id" : 200002, "mobileNo" : #######002 }
    { "id" : 200003, "mobileNo" : #######003 }

Now I want to insert the document { "id" : 200004, "mobileNo" : #######004 } from col1 to col2 i.e; the documents which doesn't match.

This is what I've tried so far :

const col1= await Col1.find({}, { mobileNo: 1,id: 1, _id: 0 })
 col1.forEach(async function (col1docs) {
    let col2doc = await Col2.find({ mobileNo: { $ne: col1docs.mobileNo}, 
    id:{$ne:col1docs.id} }, { mobileNo: 1, _id: 0, id: 1 })
     if (!(col2doc)) {
            Col2.insertMany(col1docs);

            }
        });

I have also tried with $eq instead of $ne but neither i am getting the unmatched documents nor they are getting inserted. Any suggestions??? Combination of id+phoneNo is unique

Upvotes: 1

Views: 906

Answers (1)

whoami - fakeFaceTrueSoul
whoami - fakeFaceTrueSoul

Reputation: 17915

I would say instead of doing two .find() calls plus iteration & then third call to write data, try this query :

db.col1.aggregate([
    {
      $lookup: {
        from: "col2",
        let: { id: "$id", mobileNo: "$mobileNo" },
        pipeline: [
          {
            $match: { $expr: { $and: [ { $eq: [ "$id", "$$id" ] }, { $gte: [ "$mobileNo", "$$mobileNo" ] } ] } }
          },
          { $project: { _id: 1 } } // limiting to `_id` as we don't need entire doc of `col2` - just need to see whether a ref exists or not
        ],
        as: "data"
      }
    },
    { $match: { data: [] } // Filtering leaves docs in `col1` which has no match in `col2`
    },
    { $project: { data: 0, _id: 0 } }
 ])

Test : mongoplayground

Details : From the above query you're taking advantage of specifying conditions in $lookup to get docs from col1 which have reference in col2. Let's say $lookup will run on each document of col1 - So with the unique combination of id & mobileNo from current document in col1 has a matching in col2 then col2 doc's _id will be pushed in data array, at the end what we get out of col1 is data: [] to say no matching docs were found for these col1 doc's. Now you can just write all the returned docs to col2 using .insertMany(). Actually you can do this entire thing using $merge on MongoDB version > 4.2 without any need of 2nd write call(.insertMany()).

For your scenario on MongoDB version > 4.2 something like this will merge docs to second collection :

{ $merge: 'col2' } // Has to be final stage in aggregation

Note : If this has to be done periodically - no matter how you do this, try to minimize data that you're operating on, maybe maintain a time field & you can use that field to filter docs first & do this job, or you can also take advantage of _id to say we've done for all these docs in last run & we need to start from this docs - which helps you a lot to reduce data to be worked on. Additionally don't forget to maintain indexes.

Upvotes: 1

Related Questions