Reputation: 11
Updated 01/03/2022 with real data sample
Hello
I've this kind of data (1.5M of lines stored in a mongo server)
I've to calculate the daily percentage of failed device (by date and classes of article) , and also the rate of devices in normal status (also by date and by class of articles).
Data stored in mongo
[{
"CODE_ARTICLE": "CODE_1",
"NumberOfArticles": 9,
"Status": "OK",
"DateOfExtraction": "11/01/2022 ",
},
{
"CODE_ARTICLE": "CODE_1",
"NumberOfArticles": 1,
"Status": "Failed",
"DateOfExtraction": "11/01/2022 ",
},
{
"CODE_ARTICLE": "CODE_2",
"NumberOfArticles": 8,
"Status": "Failed",
"DateOfExtraction": "11/01/2022 " :,
},
{
"CODE_ARTICLE": "CODE_2",
"NumberOfArticles": 8,
"Status": "OK",
"DateOfExtraction": "11/01/2022 ",
},
{
"CODE_ARTICLE": "CODE_1",
"NumberOfArticles": 1,
"Status": "OK",
"DateOfExtraction": "13/01/2022 ",
}
]
Target
{[
"CODE_ARTICLE": "CODE_1",
"NumberOfArticles": 9,
"Status": "OK",
"DateOfExtraction": "11/01/2022 ",
"percent" : 0.9 #( = (9/(1+9) ) 90% of 'CODE_1' devices are ok the 11/01/2022
},
{
"CODE_ARTICLE": "CODE_1",
"NumberOfArticles": 1,
"Status": "Failed",
"DateOfExtraction": "11/01/2022 ",
"percent": 0.1 # 1/(1+9) # here are the 10% of failed 'CODE_1' for 11/01/2022
},
{
"CODE_ARTICLE": "CODE_2",
"NumberOfArticles": 8,
"Status": "Failed",
"DateOfExtraction": "11/01/2022 " :,
"percent" : 0.5 # 8/(8+8 # another device : 50% of 'CODE_2' devices are failed this day
},
{
"CODE_ARTICLE": "CODE_2",
"NumberOfArticles": 8,
"Status": "OK",
"DateOfExtraction": "11/01/2022 ",
"percent" : 0.5 # 8/(8+8)
},
{
"CODE_ARTICLE": "CODE_1",
"NumberOfArticles": 1,
"Status": "OK",
"DateOfExtraction": "13/01/2022 ",
"percent" : 1 # no failed devices code_1 this day
}
] ```
Do you have any idea how I can proceed with mongo aggregations ?
Many many thanks !
Upvotes: 0
Views: 256
Reputation: 7588
For those not yet on v5.0, here is an alternative that uses the "group and push $$CURRENT" approach to hang on to all the inputs that go into a group so they can be used later.
db.foo.aggregate([
{$group: {_id: "$DateExtraction", total: {$sum:"$NbOfarticles"},
X: {$push: "$$CURRENT"} }}
,{$unwind: "$X"}
// At this point, you can see all the raw materials for the percentage are
// now at hand. Reading this next stage "inside out", we first compute
// pct, then assign it to key 'pct' in a new object, which we then merge
// with the existing object (effectively, this is 'X.pct = calc'), and
// lastly we turn the full-fledged 'X' object -- which includes '_id' and
// everything else -- into the root doc:
,{$replaceRoot: {newRoot: {$mergeObjects: ["$X",
{"pct":{$round:[{$divide:["$X.NbOfarticles","$total"]},2]}}
]}
}}
]);
Here is a variation that moves the work into a $map
before the $unwind
. Operating on the array X
inside the doc instead of as unwound objects gives us the additional opportunity to cut down the size of the array and thus create fewer new objects, although in this example we do not do that. Note that the heart of the matter (calc pct, set up a new object with just pct:calc
, and merge) is the same for both approaches.
db.foo.aggregate([
{$group: {_id: "$DateExtraction", total: {$sum:"$NbOfarticles"}, X: {$push: "$$CURRENT"} }}
,{$project: {X: {$map: {input: "$X",
in: {$mergeObjects: [ "$$this",
{'pct':{$round:[{$divide:["$$this.NbOfarticles","$total"]},2]} }
]
}
}}
}}
,{$unwind: "$X"}
,{$replaceRoot: {newRoot: "$X"}}
]);
Upvotes: 1
Reputation: 6629
count each document
db.collection.aggregate([
{
$setWindowFields: {
partitionBy: "$DateExtraction",
sortBy: {},
output: {
s: {
$sum: "$NbOfarticles",
window: {
documents: [
"unbounded",
"unbounded"
]
}
}
}
}
},
{
"$set": {
"Percent": {
$round: [
{
"$divide": [
"$NbOfarticles",
"$s"
]
},
2
]
}
}
}
])
Upvotes: 1