Pedro Daumas

Reputation: 303

Find the number of duplicate documents

A bug in my code during development created some duplicate users in my MongoDB database.

Example document from the collection:

{
    "_id" : ObjectId("5abb9d72b884fb00389efeef"),
    "user" : ObjectId("5abb9d72b884fb00389efee5"),
    "displayName" : "test",
    "fullName" : "test test test",
    "email" : "[email protected]",
    "phoneNumber" : "99999999999",
    "createdAt" : ISODate("2016-05-18T13:49:38.533Z")
}

I was able to find the duplicated users with this query:

db.users.aggregate([
    { $group: { _id: "$user", "Total": { $sum: 1 } } },
    { $match: { "Total": { $gt: 1 } } }
])

And count them with this one:

db.users.aggregate([
    { $group: { _id: "$user", "Total": { $sum: 1 } } },
    { $match: { "Total": { $gt: 1 } } },
    { $count: "Total" }
])

I want to know how many documents I'll need to delete, but the second query only returns the number of distinct users that have duplicates.

How can I get the total number of duplicated user documents, that is, a sum over "Total"?

Expected result:

{ "Total" : **** }

Upvotes: 0

Views: 88

Answers (2)

Sede

Reputation: 61225

You can do this with the following pipeline, which subtracts the number of distinct "user" values from the total number of documents:

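db.users.aggregate([
    // Collapse the whole collection into a single group.
    { $group: {
        _id: null,
        uniqueValues: { $addToSet: "$user" },  // distinct "user" values
        count: { $sum: 1 }                     // total number of documents
    }},
    // Duplicates = total documents minus distinct users.
    { $project: {
        _id: 0,
        total: { $subtract: [ "$count", { $size: "$uniqueValues" } ] }
    }}
])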
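
To see why the subtraction gives the number of documents to delete, here is a minimal plain-JavaScript sketch of the same arithmetic on a small in-memory array. The sample data and the function name countExtraDocs are made up for illustration, and the code does not talk to MongoDB.

```javascript
// Hypothetical sample data: short strings stand in for the "user" ObjectIds.
const docs = [
  { user: "a" }, { user: "a" }, { user: "a" },
  { user: "b" }, { user: "b" },
  { user: "c" },
];

// Same arithmetic as the pipeline: total documents minus distinct users.
function countExtraDocs(documents) {
  const uniqueValues = new Set(documents.map((d) => d.user)); // plays the role of $addToSet
  return documents.length - uniqueValues.size;                // plays the role of $subtract / $size
}

console.log({ total: countExtraDocs(docs) }); // { total: 3 }
```

In this sample, user "a" has two extra copies and user "b" has one, so three documents would need to be deleted, while the $count query from the question would report 2 affected users.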

Upvotes: 1

Rahul Raj

Reputation: 3459

I don't have your data set, so I haven't tested this locally. Try this query:

db.users.aggregate([
 {$group: {_id: "$user", Total: {$sum: 1}}}, // group by user and count the documents in each group
 {$addFields: {Total: {$subtract: ["$Total", 1]}}}, // keep one document per user; the rest are duplicates
 {$group: {_id: null, Total: {$sum: "$Total"}}}, // collapse all groups into one and sum the duplicate counts
 {$project: {_id: 0, Total: 1}} // return only the total number of duplicate documents
])
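
The stages can be traced with a small plain-JavaScript sketch, assuming a made-up in-memory array instead of the real collection. The names sampleDocs and countDuplicatesPerGroup are hypothetical and only illustrate the per-user counting.

```javascript
// Hypothetical sample data: short strings stand in for the "user" ObjectIds.
const sampleDocs = [
  { user: "a" }, { user: "a" }, { user: "a" },
  { user: "b" }, { user: "b" },
  { user: "c" },
];

function countDuplicatesPerGroup(documents) {
  // First $group: count documents per user.
  const totals = new Map();
  for (const d of documents) {
    totals.set(d.user, (totals.get(d.user) || 0) + 1);
  }
  // $addFields and second $group: drop one document per user, then sum the rest.
  let total = 0;
  for (const count of totals.values()) {
    total += count - 1;
  }
  // $project: return only the total.
  return { Total: total };
}

console.log(countDuplicatesPerGroup(sampleDocs)); // { Total: 3 }
```

Users with a single document contribute 0 after the subtraction, so no $match stage is needed to filter them out. One difference from the real query: on an empty collection the aggregation returns no document at all, whereas this sketch returns a Total of 0.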

Upvotes: 1
