Reputation: 23
I need help to get the array element having maximum value of a field(level) from a document. Then count the total occurences grouped by array element field "bssid".
Consider the following data
/* 1 */
{
"_id" : "18:59:36:0c:94:a3",
"timestamp" : "1460012567",
"apdata" : [{
"bssid" : "f4:b7:e2:56:e4:20",
"ssid" : "Test Network2",
"level" : -55
}, {
"bssid" : "b8:a3:86:67:03:56",
"ssid" : "Test Network1",
"level" : -76
}]
}
/* 2 */
{
"_id" : "d0:b3:3f:b9:42:38",
"timestamp" : "1460013345",
"apdata" : [{
"bssid" : "f4:b7:e2:56:e4:20",
"ssid" : "Test Network2",
"level" : -65
}, {
"bssid" : "b8:a3:86:67:03:56",
"ssid" : "Test Network1",
"level" : -46
}]
}
/* 3 */
{
"_id" : "d0:b3:3f:b9:42:41",
"timestamp" : "1460013145",
"apdata" : [{
"bssid" : "f4:b7:e2:56:e4:20",
"ssid" : "Test Network2",
"level" : -65
}, {
"bssid" : "b8:a3:86:67:03:56",
"ssid" : "Test Network1",
"level" : -46
}]
}
The output required is
{
"bssid" : "f4:b7:e2:56:e4:20",
"ssid" : "Test Network2",
"count" : 1
}, {
"bssid" : "b8:a3:86:67:03:56",
"ssid" : "Test Network1",
"count" : 2
}
Which is the count of times each bssid had the maximum value within the array of each document over the whole collection.
Upvotes: 2
Views: 2506
Reputation: 151170
If you have MongoDB 3.2 available then you can do something like this:
db.sample.aggregate([
{ "$project": {
"apdata": {
"$arrayElemAt": [
{ "$filter": {
"input": "$apdata",
"as": "el",
"cond": {
"$eq": [
"$$el.level",
{ "$max": {
"$map": {
"input": "$apdata",
"as": "data",
"in": "$$data.level"
}
}}
]
}
}},
0
]
}
}},
{ "$group": {
"_id": "$apdata.bssid",
"ssid": { "$first": "$apdata.ssid" },
"count": { "$sum": 1 }
}}
])
For at least MongoDB 2.6 you need to do this:
db.sample.aggregate([
{ "$unwind": "$apdata" },
{ "$group": {
"_id": "$_id",
"apdata": { "$push": "$apdata" },
"max": { "$max": "$apdata.level" }
}},
{ "$unwind": "$apdata" },
{ "$redact": {
"$cond": {
"if": { "$eq": [ "$apdata.level", "$max" ] },
"then": "$$KEEP",
"else": "$$PRUNE"
}
}},
{ "$group": {
"_id": "$apdata.bssid",
"ssid": { "$first": "$apdata.ssid" },
"count": { "$sum": 1 }
}}
])
And for MongoDB 2.4 or 2.2 like this:
db.sample.aggregate([
{ "$unwind": "$apdata" },
{ "$group": {
"_id": "$_id",
"apdata": { "$push": "$apdata" },
"max": { "$max": "$apdata.level" }
}},
{ "$unwind": "$apdata" },
{ "$project": {
"apdata": 1,
"isMax": { "$eq": [ "$apdata.level", "$max" ] }
}},
{ "$match": { "isMax": true } },
{ "$group": {
"_id": "$apdata.bssid",
"ssid": { "$first": "$apdata.ssid" },
"count": { "$sum": 1 }
}}
])
In all cases $max
is used to get the "maximum" value of of the array in each document "first", then you can use that to "filter" the array content prior to using it in a $group
. The approaches to this only vary with version
MongoDB 3.2: Allows the $max
to work directly on an "array" of values. So the $map
is used to just get the "level"
values and find out what that "max" actually is.
Then the $filter
can be used to just return the array element which matches that "max" value, and finally $arrayElemAt
is used to return that "only" ( out of two possible and "zero" index ) element as a plain document.
The whole process can be done in $group
"only" if you basically repeat that whole statement for both the _id
and in order to get the $first
"ssid"
value, but it's a bit easier to write in a $project
separately to demonstrate.
MongoDB 2.6: This lacks the fancier operators and most notably the ability of $max
to work "directly" on an array. The notable thing is the need to $unwind
the array first and then actually $group
just on the original document, solely in order to get that "max" value.
Then the process really needs you to $unwind
again since you will be grouping on the element from the array later, and then use $redact
to filter the content. This is a "logical" form of $match
where you can directly compare the "level"
against the computed "max" from the earlier stage. So the element that is not the "max" is removed.
MongoDB 2.4: Is again basically the same logic, except instead of $redact
you actually need the physical $project
in order to put a field in the document to use in filtering with $match
.
All versions have the same final $group
, where you supply the path to "apdata.bssid"
for the grouping key and the $first
result on that grouping boundary for the "ssid"
and a simple $sum
to count the occurrences of the grouping key in the results.
Everything returns just as follows:
{ "_id" : "f4:b7:e2:56:e4:20", "ssid" : "Test Network2", "count" : 1 }
{ "_id" : "b8:a3:86:67:03:56", "ssid" : "Test Network1", "count" : 2 }
Actually the most "efficient" form for MongoDB 3.2 would be as follows:
db.sample.aggregate([
{ "$group": {
"_id": {
"$arrayElemAt": [
{ "$map": {
"input": {
"$filter": {
"input": "$apdata",
"as": "el",
"cond": {
"$eq": [
"$$el.level",
{ "$max": {
"$map": {
"input": "$apdata",
"as": "data",
"in": "$$data.level"
}
}}
]
}
}
},
"as": "apdata",
"in": {
"bssid": "$$apdata.bssid",
"ssid": "$$apdata.ssid"
}
}},
0
]
},
"count": { "$sum": 1 }
}}
])
With a slightly different form due to the compound _id
, but it is a single $group
stage only, without repetition of the whole process to find the array element data for the "max" value:
{
"_id" : {
"bssid" : "b8:a3:86:67:03:56",
"ssid" : "Test Network1"
},
"count" : 2
}
{
"_id" : {
"bssid" : "f4:b7:e2:56:e4:20",
"ssid" : "Test Network2"
},
"count" : 1
}
Upvotes: 1