Reputation: 695
I try to count distinct values in MongoDB, but I'm not getting the results I expect.
Here is a sample of the data:
{
"_id" : ObjectId("55d354b0d5cec8aad571b8fe"),
"version" : "0.4",
"scan-time" : {
"start" : 1439913109,
"end" : 1439913136
},
"services" : [
{
"service-type" : "SV1",
"service-version" : "V1",
"location" : {
"ipv4-address" : "192.168.1.1",
"protocol" : "tcp"
},
"classification-method" : "probed"
},
{
"service-type" : "SV1",
"service-version" : "V2",
"location" : {
"ipv4-address" : "192.168.1.2",
"protocol" : "tcp"
},
"classification-method" : "probed"
},
{
"location" : {
"ipv4-address" : "192.168.1.3",
"protocol" : "tcp"
},
"classification-method" : "probed",
"service-type" : "SV1",
"service-version" : "V3"
},
{
"service-type" : "SV2",
"service-version" : null,
"location" : {
"ipv4-address" : "192.168.1.4",
"protocol" : "tcp"
},
"classification-method" : "probed"
}
]
}
I can list all the distinct values using this query:
db.collection.distinct("services.service-type")
However, I also need a count. I tried this, but it doesn't give the results I want:
db.collection.aggregate(
[
{
$group : {
_id : "$services.service-type",
count: { $sum: 1 }
}
}
]
)
I need to see this:
SV1: 3
SV2: 1
MongoDB version:
> db.version()
3.0.5
Thanks for any help!
Upvotes: 1
Views: 3042
Reputation: 61225
First you need to $project
your documents and return array of all service-type
to reduce the size of the documents you need to process in the pipeline because $unwind
duplicates each document in the collection. To do that you use the $map
operator. Then you can $unwind
the return array and finally in your group project you use the $sum
accumulator to return count of distinct service-type
db.collection.aggregate([
{ "$project": {
"services": {
"$map": {
"input": "$services",
"as": "s",
"in": "$$s.service-type"
}
}
}},
{ "$unwind": "$services" },
{ "$group": {
"_id": "$services",
"count": { "$sum": 1 }
}}
])
Which returns:
{ "_id" : "SV2", "count" : 1 }
{ "_id" : "SV1", "count" : 3 }
Upvotes: 1
Reputation: 534
To use the nested services array in your distinct query, you need to unwind the $services with $unwind
The following query will output the distinct counts:
db.collection.aggregate([
{ $unwind: "$services" },
{
$group : {
_id : { "services-service-type": "$services.service-type" },
count: { $sum: 1 }
}
}
])
Output:
{
"result" : [
{
"_id" : {
"services-service-type" : "SV2"
},
"count" : 1
},
{
"_id" : {
"services-service-type" : "SV1"
},
"count" : 3
}
],
"ok" : 1
}
Upvotes: 5