Reputation: 699
I have a MongoDB collection. I want to remove duplicate documents when two key fields are duplicated together.
db.getCollection("collection").aggregate([
    {
        // only match documents that have this field;
        // you can omit this stage if user_id is never missing
        $match: { "user_id": { $nin: [null] } }
    },
    {
        $group: { "_id": "$user_id", "doc": { "$first": "$$ROOT" } }
    },
    {
        $replaceRoot: { "newRoot": "$doc" }
    },
    { $out: "collection2" }
],
{ allowDiskUse: true }
)
The query above, from this solution, works for one key field. How can I edit it to deduplicate on two fields?
Sample collection:

   repo_id  user_id
0   667006     1060
1   667006     1060   # duplicate pair of repo_id and user_id
2   667006  2467194
3   667006    21979

Desired output:

   repo_id  user_id
0   667006     1060
1   667006  2467194
2   667006    21979
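As a side note, the behaviour of the `$group` + `$first` stages can be sketched in plain JavaScript: keep the first document seen per key and discard the rest. This is an illustration of the semantics only, not how MongoDB executes it; the sample data comes from the question.

```javascript
// Keep the first document seen for each composite key,
// mirroring $group with _id on keyFields and $first: "$$ROOT".
function dedupeFirstByKey(docs, keyFields) {
  const seen = new Map();
  for (const doc of docs) {
    // Composite key built from the requested fields.
    const key = keyFields.map((f) => String(doc[f])).join("|");
    if (!seen.has(key)) seen.set(key, doc); // $first keeps the earliest doc
  }
  return [...seen.values()];
}

// Sample documents from the question.
const docs = [
  { repo_id: 667006, user_id: 1060 },
  { repo_id: 667006, user_id: 1060 }, // duplicate (repo_id, user_id) pair
  { repo_id: 667006, user_id: 2467194 },
  { repo_id: 667006, user_id: 21979 },
];

const deduped = dedupeFirstByKey(docs, ["repo_id", "user_id"]);
// deduped contains 3 documents: the duplicate pair collapses to one
```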
Upvotes: 0
Views: 127
Reputation: 17915
All you need to change is the $group stage: group on the unique pair of repo_id and user_id. Try replacing the group stage with the one below:
{
    $group: { _id: { repo_id: "$repo_id", user_id: "$user_id" }, doc: { $first: "$$ROOT" } }
}
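Putting this together with the question's pipeline, the full two-field version would look like the sketch below. Matching both fields against null in $match is an assumption on my part (the question only filtered user_id); drop or adjust that stage to taste.

```js
db.getCollection("collection").aggregate([
    // optional: skip documents missing either key field (assumption)
    { $match: { repo_id: { $nin: [null] }, user_id: { $nin: [null] } } },
    // group on the (repo_id, user_id) pair; $first keeps one doc per pair
    { $group: { _id: { repo_id: "$repo_id", user_id: "$user_id" }, doc: { $first: "$$ROOT" } } },
    // promote the kept document back to the top level
    { $replaceRoot: { newRoot: "$doc" } },
    // write the deduplicated result to a new collection
    { $out: "collection2" }
],
{ allowDiskUse: true }
)
```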
Test: mongoplayground
Upvotes: 1