Reputation: 1556
I need to find total count of duplicate profiles per organization level. I have documents as shown below:
{
"OrganizationId" : 10,
"Profile" : {
"_id" : "75"
}
"_id" : "1"
},
{
"OrganizationId" : 10,
"Profile" : {
"_id" : "75"
}
"_id" : "2"
},
{
"OrganizationId" : 10,
"Profile" : {
"_id" : "77"
}
"_id" : "3"
},
{
"OrganizationId" : 10,
"Profile" : {
"_id" : "77"
}
"_id" : "4"
}
I have written query which is a group by ProfileId and OrganizationId. The results i am getting as shown below:
Organization Total
10 2
10 2
But i want to get the sum of total per organization level, that means Org 10 should have one row with sum of 4.
The query i am using as shown below:
db.getSiblingDB("dbName").OrgProfile.aggregate(
{ $project: { _id: 1, P: "$Profile._id", O: "$OrganizationId" } },
{ $group: {_id: { p: "$P", o: "$O"}, c: { $sum: 1 }} },
{ $match: { c: { $gt: 1 } } });
Any ideas ? Please help
Upvotes: 1
Views: 846
Reputation: 35408
The following pipeline should give you the desired output, whereas the last $project
stage is just for cosmetic purposes to turn _id
into OrganizationId
but is not needed for the essential computation so you may omit it.
db.getCollection('yourCollection').aggregate([
{
$group: {
_id: { org: "$OrganizationId", profile: "$Profile._id" },
count: { $sum: 1 }
}
},
{
$group: {
_id: "$_id.org",
Total: {
$sum: {
$cond: {
if: { $gte: ["$count", 2] },
then: "$count",
else: 0
}
}
}
}
},
{
$project: {
_id: 0,
Organization: "$_id",
Total: 1
}
}
])
gives this output
{
"Total" : 4.0,
"Organization" : 10
}
To filter out organizations without duplicates you can use $match
which will also result in a simplification of the second $group
stage
...aggregate([
{
$group: {
_id: { org: "$OrganizationId", profile: "$Profile._id" },
count: { $sum: 1 }
}
},
{
$match: {
count: { $gte: 2 }
}
},
{
$group: {
_id: "$_id.org",
Total: { $sum: "$count" }
}
},
{
$project: {
_id: 0,
Organization: "$_id",
Total: 1
}
}
])
Upvotes: 3
Reputation: 1161
I think I have a solution for you. In that last step there, instead of matching, I think you want another $group
.
.aggregate([
{ $project: { _id: 1, P: "$Profile._id", O: "$OrganizationId" } }
,{ $group: {_id: { p: "$P", o: "$O"}, c: { $sum: 1 }} }
,{ $group: { _id: "$_id.o" , c: { $sum: "$c" } }}
]);
You can probably read it and figure out yourself what's happening in that last step, but just in case I'll explain. the last step is group all documents that have the same organization id, and then summing the quantity specified by the previous c
field. After the first group, you had two documents that both had a count c
of 2 but different profile id. The next group ignores the profile id and just groups them if they have the same organization id and adds their counts.
When I ran this query, here is my result, which is what I think you're looking for:
{
"_id" : 10,
"c" : 4
}
Hope this helps. Let me know if you have any questions.
Upvotes: 0