Reputation: 565
I have a fairly simple-sounding task I'd like to achieve using MongoDB's aggregation pipeline. I want to treat the arrays in one field as sets (i.e., disregarding order and duplicates), and group by them. As an example, the collection might be:
[
{
_id: 1,
names: ["a", "b"]
},
{
_id: 2,
names: ["c", "a"]
},
{
_id: 3,
names: ["b", "a"]
}
]
And the result I want back is something like:
[
{
names: ["a", "b"],
count: 2
},
{
names: ["a", "c"],
count: 1
}
]
Thanks!
Upvotes: 2
Views: 70
Reputation: 8978
You can definitely get your result by stitching together multiple aggregation pipeline stages.
db.collection.aggregate([
{$unwind:"$names"},
{$sort:{_id:1, names:1}},
{$group:{_id:"$_id", names:{$push:"$names"}}},
{$group:{_id:"$names", count:{$sum:1}}},
{$project:{_id:0, names:"$_id", count:1}}
])
It emits:
{
"count" : NumberInt(1),
"names" : [
"a",
"c"
]
}
{
"count" : NumberInt(2),
"names" : [
"a",
"b"
]
}
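One thing to watch: the pipeline above fixes the order but keeps repeated values, so ["b", "a", "a"] would come out as ["a", "a", "b"] and land in its own group. If duplicates should be ignored as well, a possible variant (a sketch only, not tested against every server version) adds one extra $group before the sort. The field names doc and name inside the compound _id are arbitrary choices made for this example.

```javascript
// Sketch only: same idea as the pipeline above, plus one extra $group that
// collapses repeated values inside a single document before sorting.
// "doc" and "name" are arbitrary field names chosen for this illustration.
const pipeline = [
  { $unwind: "$names" },
  // One row per distinct (document, value) pair, so ["b", "a", "a"] yields two rows.
  { $group: { _id: { doc: "$_id", name: "$names" } } },
  // Order the values within each document so equal sets are rebuilt identically.
  { $sort: { "_id.doc": 1, "_id.name": 1 } },
  // Rebuild one sorted, duplicate-free array per original document.
  { $group: { _id: "$_id.doc", names: { $push: "$_id.name" } } },
  // Count how many documents share each canonical array.
  { $group: { _id: "$names", count: { $sum: 1 } } },
  { $project: { _id: 0, names: "$_id", count: 1 } }
];

// In the shell: db.collection.aggregate(pipeline)
```

Note that $unwind drops documents whose names array is empty or missing, so those documents are not counted by either version.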
Upvotes: 1
Reputation: 50406
You need to $sort the results to make them consistent for a grouping key. There really is no other way:
db.collection.aggregate([
{ "$unwind": "$names" },
{ "$sort": { "_id": 1, "names": 1} },
{ "$group": {
"_id": "$_id",
"names": { "$push": "$names" }
}},
{ "$group": {
"_id": "$names",
"count": { "$sum": 1 }
}}
])
That returns the groups you asked for, with each set in the _id field:
[
{
"_id": ["a", "b"],
"count": 2
},
{
"_id": ["a", "c"],
"count": 1
}
]
Whilst there are quite a few operators that treat arrays as "sets", none of them "reorder" the array content in a consistent way that would apply when grouping. This is only ever done when you $sort.
Even if the arrays contained "duplicates" and had some set transformation applied, they are still not consistently ordered:
db.testa.insertMany([
{ "a" : [ "a", "b" ] },
{ "a" : [ "b", "a" ] },
{ "a" : [ "b", "a", "a" ] }
])
db.testa.aggregate([{ "$project": { "_id": 0, "a": { "$setUnion": [ "$a", [] ] } } }])
Since the order of elements produced by $setUnion is not specified, that sample can return something like:
{ "a" : [ "b", "a" ] }
{ "a" : [ "a", "b" ] }
{ "a" : [ "a", "b" ] }
So you would "still" need to $unwind and $sort in order to get a consistent "set" for grouping purposes.
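To see why a consistent order matters for the grouping key, here is a small plain JavaScript illustration that runs without MongoDB. It is only a sketch of the idea, not how the server implements $group: canonicalSet and groupBySet are hypothetical helpers written for this example, and the values are assumed to be strings (the default sort compares them as strings).

```javascript
// Plain JavaScript illustration, no MongoDB required.
// canonicalSet is a hypothetical helper: drop duplicates, then sort,
// so that every array describing the same set yields the same key.
function canonicalSet(values) {
  return [...new Set(values)].sort();
}

// Count documents per canonical set, keyed by the JSON form of the sorted array.
function groupBySet(docs) {
  const counts = new Map();
  for (const doc of docs) {
    const key = JSON.stringify(canonicalSet(doc.names));
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts].map(([key, count]) => ({ names: JSON.parse(key), count }));
}

// Sample data from the question.
const docs = [
  { _id: 1, names: ["a", "b"] },
  { _id: 2, names: ["c", "a"] },
  { _id: 3, names: ["b", "a"] }
];

console.log(groupBySet(docs));
```

With the question's sample data this gives one group for ["a", "b"] with count 2 and one for ["a", "c"] with count 1. Without the sort step, ["a", "b"] and ["b", "a"] would produce different keys and be counted separately, which is exactly the problem with grouping on the raw arrays.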
Upvotes: 1