Reputation: 573
I'd like to count all emails within the specific project (ID: 7), but ignoring duplicate rows in ONE campaign.
Here's the example of my collection structure:
{
"_id" : ObjectId("581a9054c274f7b512e8ed94"),
"email" : "[email protected]",
"IDproject" : 7,
"IDcampaign" : 10
}
{
"_id" : ObjectId("581a9064c274f7b512e8ed95"),
"email" : "[email protected]",
"IDproject" : 7,
"IDcampaign" : 10
}
{
"_id" : ObjectId("581a9068c274f7b512e8ed96"),
"email" : "[email protected]",
"IDproject" : 7,
"IDcampaign" : 10
}
{
"_id" : ObjectId("581a906cc274f7b512e8ed97"),
"email" : "[email protected]",
"IDproject" : 7,
"IDcampaign" : 11
}
{
"_id" : ObjectId("581a9072c274f7b512e8ed98"),
"email" : "[email protected]",
"IDproject" : 7,
"IDcampaign" : 11
}
{
"_id" : ObjectId("581a9079c274f7b512e8ed99"),
"email" : "[email protected]",
"IDproject" : 7,
"IDcampaign" : 12
}
This is what the result should be:
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
Total: 5
(of 6). Note that [email protected]
is mentioned twice. That's because [email protected]
has campaigns 10, 10 and 11. We're ignoring one 10.
This is what I've tried:
db.mycollection.aggregate([
{$match : {IDproject : 7}},
{$group : {_id : "$email", total : {$sum : 1}}}
])
But it returns only unique emails ignoring IDcampaign
. Also, I can get unique number of emails with the following query:
db.mycollection.distinct('email', {IDproject : 7})
But again, it shows only unique emails ignoring IDcampaign
.
Could someone give me a hint how to count emails including IDcampaign
?
Thanks.
p.s. I use MongoDB with PHP, and I can solve the problem with PHP calculations, but that's not the solution.
Upvotes: 1
Views: 1067
Reputation: 103365
Include it as part of your group key, as in the following example:
db.mycollection.aggregate([
{ "$match": { "IDproject": 7 } },
{
"$group": {
"_id": {
"email" : "$email",
"IDcampaign" : "$IDcampaign"
},
"count": { "$sum": 1 }
}
}
])
Sample Output
/* 1 */
{
"_id" : {
"email" : "[email protected]",
"IDcampaign" : 10
},
"count" : 1
}
/* 2 */
{
"_id" : {
"email" : "[email protected]",
"IDcampaign" : 12
},
"count" : 1
}
/* 3 */
{
"_id" : {
"email" : "[email protected]",
"IDcampaign" : 11
},
"count" : 1
}
/* 4 */
{
"_id" : {
"email" : "[email protected]",
"IDcampaign" : 10
},
"count" : 2
}
/* 5 */
{
"_id" : {
"email" : "[email protected]",
"IDcampaign" : 11
},
"count" : 1
}
To answer your follow-up question on getting the count only since you don't need the list of emails, you could run the following pipeline
db.mycollection.aggregate([
{ "$match": { "IDproject": 7 } },
{
"$group": {
"_id": null,
"count": { "$sum": 1 },
"emails": {
"$addToSet": {
"email" : "$email",
"IDcampaign" : "$IDcampaign"
}
}
}
},
{
"$project": {
"_id": 0,
"count": 1,
"total": { "$size": "$emails" }
}
}
])
which gives you the result
{
"total" : 5,
"count" : 6
}
that you can interpret as Total 5 (of 6)
Upvotes: 4