Reputation: 303
I somehow created duplicates of every single entry in my database. There are currently 176039 documents and counting, and half of them are duplicates. Each document is structured like so:
_id : 5b41d9ccf10fcf0014fe8917
originName : "Hartsfield Jackson Atlanta International Airport"
destinationName : "Antigua"
totalDuration : 337
In the MongoDB Compass Community app for Mac, under the Aggregations tab, I was able to find the duplicates using this pipeline:
[
  {$group: {
    _id: {originName: "$originName", destinationName: "$destinationName"},
    count: {$sum: 1}}},
  {$match: {count: {"$gt": 1}}}
]
I'm not sure how to move forward and delete the duplicates at this point. I'm assuming it has something to do with $out.
Edit: Something I didn't notice until now is that the values for totalDuration in each pair of duplicates are actually different.
Upvotes: 0
Views: 1255
Reputation: 75934
Add these stages to the end of your pipeline:
{$project: {_id: 0, "originName": "$_id.originName", "destinationName": "$_id.destinationName"}},
{$out: "collectionname"}
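This will replace the documents in your current collection with the documents produced by the aggregation pipeline. If you need totalDuration in the collection, add that field to both the $group stage (using an accumulator such as $first) and the $project stage before running the pipeline. Note that the $match stage keeps only groups with a count greater than 1, so leave it out if some documents have no duplicate and should still be kept.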
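Put together, the full pipeline might look like the sketch below. The collection names "flights" and "flights_deduped" are hypothetical placeholders, and using $first to pick a totalDuration is an assumption: since the duplicates carry different durations, $min, $max, or $avg may be a better fit depending on which value you want to keep.

```javascript
// Hypothetical names: "flights" is the source collection,
// "flights_deduped" is the collection that $out writes to.
const dedupePipeline = [
  {
    // One group per unique (originName, destinationName) pair.
    $group: {
      _id: { originName: "$originName", destinationName: "$destinationName" },
      // Assumption: keep the first totalDuration encountered in each group.
      totalDuration: { $first: "$totalDuration" }
    }
  },
  {
    // Flatten the grouped key back into top-level fields.
    // Dropping _id lets MongoDB generate a fresh one for each output document.
    $project: {
      _id: 0,
      originName: "$_id.originName",
      destinationName: "$_id.destinationName",
      totalDuration: 1
    }
  },
  // $out takes the target collection name as a string.
  { $out: "flights_deduped" }
];

// In the mongo shell this would be run as:
// db.flights.aggregate(dedupePipeline)
```

Writing to a separate collection first lets you check the document count before replacing the original. Once the result looks right, you can point $out at the original collection name instead.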
Upvotes: 2