Reputation: 303
A bug in my code during development created some duplicate users in my MongoDB database.
Example document from the collection:
{
    "_id" : ObjectId("5abb9d72b884fb00389efeef"),
    "user" : ObjectId("5abb9d72b884fb00389efee5"),
    "displayName" : "test",
    "fullName" : "test test test",
    "email" : "[email protected]",
    "phoneNumber" : "99999999999",
    "createdAt" : ISODate("2016-05-18T13:49:38.533Z")
}
I was able to find the duplicated users with this query:
db.users.aggregate([
    { $group: { _id: "$user", Total: { $sum: 1 } } },
    { $match: { Total: { $gt: 1 } } }
])
And count them with this one:
db.users.aggregate([
    { $group: { _id: "$user", Total: { $sum: 1 } } },
    { $match: { Total: { $gt: 1 } } },
    { $count: "Total" }
])
I want to know how many users I'll need to delete, but the second query only returns the number of distinct users that have duplicates.
How can I get the total number of duplicated users, or the sum of "Total"?
Expected result:
{ "Total" : **** }
Upvotes: 0
Views: 88
Reputation: 61225
You can do this with the following pipeline, which subtracts the number of distinct users from the total document count:
[
    { $group: {
        _id: null,
        uniqueValues: { $addToSet: "$user" },
        count: { $sum: 1 }
    }},
    { $project: {
        total: { $subtract: [ "$count", { $size: "$uniqueValues" } ] }
    }}
]
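To see why the subtraction gives the number of documents to delete, here is a plain JavaScript sketch of the same arithmetic on in-memory data. It does not talk to MongoDB; the `sampleDocs` array and the `countSurplus` helper are hypothetical and exist only for illustration.

```javascript
// Hypothetical sample data: user "a" appears 3 times, "b" twice, "c" once.
const sampleDocs = [
  { user: "a" }, { user: "a" }, { user: "a" },
  { user: "b" }, { user: "b" },
  { user: "c" },
];

// Mirrors the pipeline: count every document, collect the distinct users,
// then subtract the number of distinct users from the document count.
function countSurplus(documents) {
  const uniqueValues = new Set(documents.map((d) => d.user)); // plays the role of $addToSet
  const count = documents.length;                             // plays the role of $sum: 1
  return count - uniqueValues.size;                           // $subtract with $size
}

console.log({ total: countSurplus(sampleDocs) }); // { total: 3 }
```

Six documents minus three distinct users leaves three surplus documents: two extra copies of "a" and one extra copy of "b".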
Upvotes: 1
Reputation: 3459
I don't have your data set, so I didn't test this locally. Try this query:
db.users.aggregate([
    { $group: { _id: "$user", Total: { $sum: 1 } } },          // group by user and count the documents in each group
    { $addFields: { Total: { $subtract: [ "$Total", 1 ] } } }, // keep one document per user; the rest are duplicates
    { $group: { _id: null, Total: { $sum: "$Total" } } },      // collapse all groups into one and sum the duplicate counts
    { $project: { _id: 0, Total: 1 } }                         // output only the total number of duplicate users
])
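The stages above can be traced by hand with a small plain JavaScript sketch. It runs on an in-memory array rather than a MongoDB collection, and the `exampleDocs` data and `countDuplicates` helper are hypothetical names made up for this illustration.

```javascript
// Hypothetical sample data: user "a" appears 3 times, "b" twice, "c" once.
const exampleDocs = [
  { user: "a" }, { user: "a" }, { user: "a" },
  { user: "b" }, { user: "b" },
  { user: "c" },
];

function countDuplicates(documents) {
  // Step 1: group by user and count each group (first $group stage).
  const perUser = new Map();
  for (const doc of documents) {
    perUser.set(doc.user, (perUser.get(doc.user) || 0) + 1);
  }

  // Steps 2 and 3: subtract 1 from each count ($addFields),
  // then sum the results across all groups (second $group stage).
  let total = 0;
  for (const groupCount of perUser.values()) {
    total += groupCount - 1;
  }
  return total;
}

console.log({ Total: countDuplicates(exampleDocs) }); // { Total: 3 }
```

Users that appear only once contribute 0 to the sum, which is why this version needs no `$match` on `Total > 1`.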
Upvotes: 1