Reputation: 58
I am trying to compare a large number of documents in two collections. To give you an estimate, I have around 1300 documents in each of the two collections.
I want to generate a diff report after comparing the two collections. I do not need to point out exactly what is missing or what new content has been added; I just need to identify that there is in fact some difference between two corresponding documents. Yes, I do have a unique identifier for each document other than Mongo's ObjectId ("_id").
Note: I have implemented the database using the denormalized data model, which means I have embedded documents (documents within documents).
What would you say is the best way to go about implementing this?
Thank you in advance for your time, samaritans!
Upvotes: 2
Views: 381
Reputation: 22296
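You can use $lookup with a sub-pipeline and $eq on all the fields you care about. Each document in collection1 is matched against the document in collection2 that has the same unique identifier and the same values in those fields; any document left without a match differs from its counterpart (or has no counterpart at all).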
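db.collection1.aggregate([
  {
    $lookup: {
      from: "collection2",
      // Expose the collection1 fields to the sub-pipeline as variables.
      let: { unique_id: "$unique_id", field1: "$field", field2: "$field" /* , ... */ },
      pipeline: [
        {
          $match: {
            $expr: {
              $and: [
                // Same document (by unique identifier) and same value in every compared field.
                { $eq: [ "$unique_id_in_2", "$$unique_id" ] },
                { $eq: [ "$field_to_match", "$$field1" ] },
                { $eq: [ "$field_to_match.2", "$$field2" ] }
              ]
            }
          }
        }
      ],
      as: "matches"
    }
  },
  {
    // Keep only the documents that have no identical counterpart in collection2.
    $match: {
      "matches.0": { $exists: false }
    }
  }
])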
Note: this form of $lookup (with let and pipeline) requires MongoDB 3.6 or later.
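With many fields, writing one let variable and one $eq per field by hand gets tedious, so the pipeline can be generated instead. The following is only a sketch: buildDiffPipeline and its parameters are hypothetical names, and it assumes the unique identifier and the compared fields have the same names in both collections (unlike the unique_id / unique_id_in_2 placeholders above). It also adds a $project stage, which is not part of the pipeline above, to return just the identifiers.

```javascript
// Hypothetical helper (not part of MongoDB): builds a $lookup-based diff
// pipeline from a list of field names.
// Assumption: idField and every entry in fields exist under the same name
// in both collections.
function buildDiffPipeline(otherCollection, idField, fields) {
  // Variables exposed to the sub-pipeline: the identifier plus one per field.
  const letVars = { id: "$" + idField };
  // Conditions a document in the other collection must meet to count as identical.
  const conditions = [{ $eq: ["$" + idField, "$$id"] }];

  fields.forEach((field, i) => {
    // Generated names (f0, f1, ...) start with a lowercase letter,
    // as $lookup requires for let variables.
    const varName = "f" + i;
    letVars[varName] = "$" + field;
    conditions.push({ $eq: ["$" + field, "$$" + varName] });
  });

  return [
    {
      $lookup: {
        from: otherCollection,
        let: letVars,
        pipeline: [{ $match: { $expr: { $and: conditions } } }],
        as: "matches",
      },
    },
    // Keep only documents with no identical counterpart.
    { $match: { "matches.0": { $exists: false } } },
    // Report just the unique identifier of each differing document.
    { $project: { _id: 0, [idField]: 1 } },
  ];
}

// Example with made-up field names; "address" stands in for an embedded document.
const pipeline = buildDiffPipeline("collection2", "unique_id", ["name", "address"]);
```

In the shell this would be run as db.collection1.aggregate(pipeline). Two caveats: the result only lists documents from collection1, so running the same pipeline from collection2 against collection1 is needed to catch documents that exist only in collection2; and $eq on an embedded document compares it as a whole, including field order, so two embedded documents with the same content in a different key order would probably be reported as different.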
Upvotes: 2