Tomas
Tomas

Reputation: 695

Counting distinct values in MongoDB

I try to count distinct values in MongoDB, but I'm not getting the results I expect.

Here is a sample of the data:

{
        "_id" : ObjectId("55d354b0d5cec8aad571b8fe"),
        "version" : "0.4",
        "scan-time" : {
                "start" : 1439913109,
                "end" : 1439913136
        },
        "services" : [
                {
                        "service-type" : "SV1",
                        "service-version" : "V1",
                        "location" : {
                                "ipv4-address" : "192.168.1.1",
                                "protocol" : "tcp"
                        },
                        "classification-method" : "probed"
                },
                {
                        "service-type" : "SV1",
                        "service-version" : "V2",
                        "location" : {
                                "ipv4-address" : "192.168.1.2",
                                "protocol" : "tcp"
                        },
                        "classification-method" : "probed"
                },
                {
                        "location" : {
                                "ipv4-address" : "192.168.1.3",
                                "protocol" : "tcp"
                        },
                        "classification-method" : "probed",
                        "service-type" : "SV1",
                        "service-version" : "V3"
                },
                {
                        "service-type" : "SV2",
                        "service-version" : null,
                        "location" : {
                                "ipv4-address" : "192.168.1.4",
                                "protocol" : "tcp"
                        },
                        "classification-method" : "probed"
                }
        ]
}

I can list all the distinct values using this query:

db.collection.distinct("services.service-type")

However, I also need a count. I tried this, but it doesn't give the results I want:

db.collection.aggregate(
    [
       {
         $group : {
            _id : "$services.service-type",
            count: { $sum: 1 }
         }
       }
    ]
)

I need to see this:

SV1: 3
SV2: 1

MongoDB version:

> db.version()
3.0.5

Thanks for any help!

Upvotes: 1

Views: 3042

Answers (2)

Sede
Sede

Reputation: 61225

First you need to $project your documents and return array of all service-type to reduce the size of the documents you need to process in the pipeline because $unwind duplicates each document in the collection. To do that you use the $map operator. Then you can $unwind the return array and finally in your group project you use the $sum accumulator to return count of distinct service-type

db.collection.aggregate([
    { "$project": { 
        "services": { 
            "$map": { 
                "input": "$services", 
                "as": "s", 
                "in": "$$s.service-type" 
            }
        }
    }}, 
    { "$unwind": "$services" },
    { "$group": { 
        "_id": "$services",  
        "count": { "$sum": 1 } 
    }}
])

Which returns:

{ "_id" : "SV2", "count" : 1 }
{ "_id" : "SV1", "count" : 3 }

Upvotes: 1

Jasper
Jasper

Reputation: 534

To use the nested services array in your distinct query, you need to unwind the $services with $unwind

The following query will output the distinct counts:

db.collection.aggregate([     
    { $unwind: "$services" },
    {
      $group : {
         _id : { "services-service-type": "$services.service-type" },
         count: { $sum: 1 }
      }
    }
])

Output:

{
    "result" : [ 
        {
            "_id" : {
                "services-service-type" : "SV2"
            },
            "count" : 1
        }, 
        {
            "_id" : {
                "services-service-type" : "SV1"
            },
            "count" : 3
        }
    ],
    "ok" : 1
}

Upvotes: 5

Related Questions