Reputation: 25
I have one collection that the structure of the record look like this,
{
_id: ObjectId('66a544dc3a8dff9aaaf65343'),
user_type: "USER",
status: CREATE
}
The explanation of each field:
I want to run the aggregate to get this output
{
{
_id: "ALL",
stat: {
"CREATE" : 2,
"ACTIVE" : 5,
"DELETE" : 4
}
"total": 11
{
_id: "USER",
stat: {
"CREATE" : 1,
"ACTIVE" : 2,
"DELETE" : 0
},
"total": 3
},
{
_id: "ADMIN",
stat: {
"CREATE" : 0,
"ACTIVE" : 3,
"DELETE" : 0
},
"total": 3
},
{
_id: "DEV",
stat: {
"CREATE" : 1,
"ACTIVE" : 0,
"DELETE" : 4
},
"total": 5
}
}
I want to count each occurrence of status for all user_type, compute total number of all status, and create a TOTAL document to count the entire collection
I can easily generate all documents except _id: "ALL" document easily using $group with $project. The aggregate command look like this,
db.getCollection('accounts').aggregate(
[
{
$group: {
_id: '$user_type',
CREATE: {
$sum: {
$cond: [
{
$eq: ['$$ROOT.status', 'CREATE']
},
1,
0
]
}
},
ACTIVE: {
$sum: {
$cond: [
{
$eq: ['$$ROOT.status', 'ACTIVE']
},
1,
0
]
}
},
DELETE: {
$sum: {
$cond: [
{
$eq: ['$$ROOT.status', 'DELETE']
},
1,
0
]
}
},
}
},
{
$project: {
stat: {
CREATE: '$CREATE',
ACTIVE: '$ACTIVE',
DELETE: '$DELETE',
},
total: {
$sum: [
'$CREATE',
'$ACTIVE',
'$DELETE',
]
}
}
}
],
);
The problem arise when I want to generate another document. I can put another $group command (using _id: null), but it will delete all others record (the only thing left will be the record which _id: null)
Is there a way to preserve the previous records in pipeline? or is there an alternative solution which is better that this? any comment is welcome.
Thank you very much!
Upvotes: 1
Views: 53
Reputation: 1426
If you are using mongodb 6 or greater then following code will work as well
db.getCollection("accounts").aggregate([{
$group:{
_id:{user_type:"$user_type",status:"$status"},
count:{$sum:1}
}
},{
$group:{
_id:"$_id.user_type",
total:{$sum:"$count"},
status:{$push:{k:"$_id.status",v:"$count"}}
}
},{
$facet:{
"all":[{
$group:{
_id:"ALL",
total:{$sum:"$total"},
stat:{
$push:"$status"
}
}
},{
$addFields:{
stat:{
$function:{
body:`function(stat){var obj = {};for(let s of stat)for(let o of s)obj[o.k]= +(obj[o.k] ? obj[o.k] : 0) + +o?.v ;return obj}`,
args:["$stat"],
lang:"js"
}
}
}
}],
"others":[{
$addFields:{
stat:{$arrayToObject:"$status"}
}
}]
}
},{
$replaceRoot:{
newRoot:{result:{$concatArrays:["$all","$others"]}}
}
}])
Pros:- You can add new status without any hard coding. There is no need to do same group repeatedly as you can utilise previous information.
Cons:- Can be used with latest versions of mongodb. Need to allow server side javascript. Pre computed pattern is still the winner.
Upvotes: 0
Reputation: 72
Add the following pipelines to the aggregation.
{
$group: {
_id: "ALL",
CREATE: { $sum: "$stat.CREATE" },
ACTIVE: { $sum: "$stat.ACTIVE" },
DELETE: { $sum: "$stat.DELETE" },
total: { $sum: "$total" },
groups: { $push: "$$ROOT" },
},
},
{
$project: {
stat: {
CREATE: "$CREATE",
ACTIVE: "$ACTIVE",
DELETE: "$DELETE",
},
total: 1,
groups: 1,
},
},
In the groups field, you can find the document by $user_type.
Upvotes: 0
Reputation: 15217
I can think of 2 ways to do it.
$facet
to perform the grouping sum twice. This might involve more data wrangling at later stages, but this should be the most performant and the canonical way to do what you expect.db.accounts.aggregate([
{
"$facet": {
"accounts": [
{
"$group": {
"_id": "$user_type",
CREATE: {
$sum: {
$cond: [
{
$eq: [
"$status",
"CREATE"
]
},
1,
0
]
}
},
ACTIVE: {
$sum: {
$cond: [
{
$eq: [
"$status",
"ACTIVE"
]
},
1,
0
]
}
},
DELETE: {
$sum: {
$cond: [
{
$eq: [
"$status",
"DELETE"
]
},
1,
0
]
}
}
}
}
],
"all": [
{
"$group": {
"_id": "ALL",
CREATE: {
$sum: {
$cond: [
{
$eq: [
"$status",
"CREATE"
]
},
1,
0
]
}
},
ACTIVE: {
$sum: {
$cond: [
{
$eq: [
"$status",
"ACTIVE"
]
},
1,
0
]
}
},
DELETE: {
$sum: {
$cond: [
{
$eq: [
"$status",
"DELETE"
]
},
1,
0
]
}
}
}
}
]
}
},
{
"$project": {
"docs": {
"$concatArrays": [
"$accounts",
"$all"
]
}
}
},
{
"$unwind": "$docs"
},
{
"$replaceRoot": {
"newRoot": "$docs"
}
},
{
$project: {
stat: {
CREATE: "$CREATE",
ACTIVE: "$ACTIVE",
DELETE: "$DELETE"
},
total: {
$sum: [
"$CREATE",
"$ACTIVE",
"$DELETE"
]
}
}
}
])
$$unionWith
after your current $group
stage to perform the grouping sum again.db.accounts.aggregate([
{
"$group": {
"_id": "$user_type",
CREATE: {
$sum: {
$cond: [
{
$eq: [
"$status",
"CREATE"
]
},
1,
0
]
}
},
ACTIVE: {
$sum: {
$cond: [
{
$eq: [
"$status",
"ACTIVE"
]
},
1,
0
]
}
},
DELETE: {
$sum: {
$cond: [
{
$eq: [
"$status",
"DELETE"
]
},
1,
0
]
}
}
}
},
{
"$unionWith": {
"coll": "accounts",
"pipeline": [
{
"$group": {
"_id": "ALL",
CREATE: {
$sum: {
$cond: [
{
$eq: [
"$status",
"CREATE"
]
},
1,
0
]
}
},
ACTIVE: {
$sum: {
$cond: [
{
$eq: [
"$status",
"ACTIVE"
]
},
1,
0
]
}
},
DELETE: {
$sum: {
$cond: [
{
$eq: [
"$status",
"DELETE"
]
},
1,
0
]
}
}
}
}
]
}
},
{
$project: {
stat: {
CREATE: "$CREATE",
ACTIVE: "$ACTIVE",
DELETE: "$DELETE"
},
total: {
$sum: [
"$CREATE",
"$ACTIVE",
"$DELETE"
]
}
}
}
])
Strictly speaking, both of them are just workarounds and do not actually address the need to reuse previously computed results. Although not likely, they might incur performance issues since they are grouping twice. If the data does not change frequently or you have more tolerance for stale data, you may consider materializing the computed result and perform further grouping on it.
Upvotes: 0