Srini Kandula

Reputation: 1001

MongoDB aggregate group by sum of distinct column

I have an analytics collection with the sample data below.

{ "_id" : ObjectId("55f996a4e4b0cc9c0a392594"), "action" : "apiUploadFile", "assetId" : "55f996a4e4b0cc9c0a392593" },
{ "_id" : ObjectId("5603d384e4b0cf75af10be88"), "action" : "agAsset",  "assetId" : "55f996a4e4b0cc9c0a392593"},
{ "_id" : ObjectId("5603d395e4b0cf75af10becc"), "action" : "aAD", "assetId" : "55f996a4e4b0cc9c0a392593" },
{ "_id" : ObjectId("5603d395e4b0cf75af10becd"), "action" : "mobCmd", "assetId" : "55f996a4e4b0cc9c0a392593", sessionId : "123"},
{ "_id" : ObjectId("5603d395e4b0cf75af10bece"), "action" : "mobCmd", "assetId" : "55f996a4e4b0cc9c0a392593", sessionId : "1234" },
{ "_id" : ObjectId("5603d395e4b0cf75af10becf"), "action" : "mobCmd", "assetId" : "55f996a4e4b0cc9c0a392593", sessionId : "1234" }

I need to count the analytics documents grouped by 'assetId', with a separate count for each 'action' type. I have come up with the query below:

db.analytics.aggregate([
    { $match: { assetId: { $ne: null } } },
    { $group: {
        _id: { assId: '$assetId' },
        viewCount: {
            $sum: { $cond: [ { $eq: [ '$action', 'agAsset' ] }, 1, 0 ] }
        },
        sessionCount: {
            $sum: { $cond: [ { $eq: [ '$action', 'mobCmd' ] }, 1, 0 ] }
        }
    }}
])

This works well, except that 'sessionCount' counts every 'mobCmd' document instead of counting distinct 'sessionId' values. For example, here is the current output:

{ "_id" : { "assId" : "55f996a4e4b0cc9c0a392593" }, "viewCount" : 1,  "sessionCount" : 3 }

The expected output is

{ "_id" : { "assId" : "55f996a4e4b0cc9c0a392593" }, "viewCount" : 1,  "sessionCount" : 2 }

I need sessionCount to be the number of distinct sessionId values among the documents with action='mobCmd'. How can I use distinct inside the $sum operation of the 'sessionCount' section?

Upvotes: 1

Views: 1502

Answers (1)

Sede

Reputation: 61225

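You will need to group your documents on a compound _id field made of assetId and sessionId. Note that this returns one document per asset and session, so sessionCount here is the number of 'mobCmd' documents in each session rather than a distinct count per asset.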

db.collection.aggregate([
    { "$match": { "assetId": { "$ne": null }}},
    { "$group": { 
        "_id": { "assId": "$assetId", "sessionId": "$sessionId" }, 
        "viewCount": { 
            "$sum": { 
                "$cond": [ 
                    { "$eq": [ "$action", "agAsset" ] }, 
                    1,
                    0 
                ] 
            }
         }, 
         "sessionCount": { 
             "$sum": { 
                 "$cond": [ 
                     { "$eq": [ "$action", "mobCmd" ] }, 
                     1, 
                     0 
                 ] 
             }
         }
     }}
])

Which yields:

{ "_id" : { "assId" : "55f996a4e4b0cc9c0a392593", "sessionId" : "1234" }, "viewCount" : 0, "sessionCount" : 2 }
{ "_id" : { "assId" : "55f996a4e4b0cc9c0a392593", "sessionId" : "123" }, "viewCount" : 0, "sessionCount" : 1 }
{ "_id" : { "assId" : "55f996a4e4b0cc9c0a392593" }, "viewCount" : 1, "sessionCount" : 0 }
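The output above still has one row per asset and session. To collapse it into the single document the question expects, a second $group stage can be appended. The following is only a sketch: the variable name perSessionPipeline and the intermediate field mobCmdCount are made up for illustration, and it assumes the collection is called analytics as in the question. A 'mobCmd' document with no sessionId at all would be counted as one extra session.

```javascript
// Sketch: group per (assetId, sessionId) first, then once more per asset.
var perSessionPipeline = [
    { $match: { assetId: { $ne: null } } },
    { $group: {
        _id: { assId: "$assetId", sessionId: "$sessionId" },
        viewCount: {
            $sum: { $cond: [ { $eq: [ "$action", "agAsset" ] }, 1, 0 ] }
        },
        // Hypothetical helper field: number of mobCmd documents in this session.
        mobCmdCount: {
            $sum: { $cond: [ { $eq: [ "$action", "mobCmd" ] }, 1, 0 ] }
        }
    }},
    { $group: {
        _id: { assId: "$_id.assId" },
        viewCount: { $sum: "$viewCount" },
        // Each (asset, session) group with at least one mobCmd counts once.
        sessionCount: {
            $sum: { $cond: [ { $gt: [ "$mobCmdCount", 0 ] }, 1, 0 ] }
        }
    }}
];

// Run only when a mongo shell `db` object is available.
if (typeof db !== "undefined") {
    db.analytics.aggregate(perSessionPipeline).forEach(printjson);
}
```

On the sample data in the question this should give viewCount 1 and sessionCount 2 for the single asset.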

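Alternatively, use the $addToSet operator to collect the unique sessionId values for each asset, $unwind that array, and regroup your documents to count its elements. Note that this collects sessionId values from every action, not only 'mobCmd', and that an asset with no sessionId values at all is dropped by $unwind.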

db.collection.aggregate([
    { "$match": { "assetId": { "$ne": null }}}, 
    { "$group": { 
        "_id": "$assetId", 
        "sessionId": { "$addToSet": "$sessionId" }, 
        "viewCount": { 
            "$sum": { 
                "$cond": [ 
                    { "$eq": [ "$action", "agAsset" ] },
                    1, 
                    0
                ] 
            }
        }
    }}, 
    { "$unwind": "$sessionId" }, 
    { "$group": { 
        "_id": "$_id", 
        "viewCount": { "$first": "$viewCount" },  
        "sessionCount": { "$sum": 1 } 
    }}
])

Which returns:

{ "_id" : "55f996a4e4b0cc9c0a392593", "viewCount" : 1, "sessionCount" : 2 } 
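If only the 'mobCmd' sessions should be counted, one possible variation is to make the $addToSet conditional and measure the resulting set with $size instead of unwinding it. This is a sketch under a few assumptions: MongoDB 2.6 or later (for $size and $setDifference), a collection named analytics as in the question, and the illustrative names distinctSessionPipeline and sessionIds, which do not appear in the original post.

```javascript
// Sketch: count distinct sessionId values for mobCmd documents in one $group.
var distinctSessionPipeline = [
    { $match: { assetId: { $ne: null } } },
    { $group: {
        _id: { assId: "$assetId" },
        viewCount: {
            $sum: { $cond: [ { $eq: [ "$action", "agAsset" ] }, 1, 0 ] }
        },
        // Only mobCmd documents contribute their sessionId; all others add null.
        sessionIds: {
            $addToSet: {
                $cond: [ { $eq: [ "$action", "mobCmd" ] }, "$sessionId", null ]
            }
        }
    }},
    { $project: {
        viewCount: 1,
        // Remove the null placeholder, then count the remaining unique ids.
        sessionCount: { $size: { $setDifference: [ "$sessionIds", [ null ] ] } }
    }}
];

// Run only when a mongo shell `db` object is available.
if (typeof db !== "undefined") {
    db.analytics.aggregate(distinctSessionPipeline).forEach(printjson);
}
```

Because there is no $unwind, an asset without any 'mobCmd' documents stays in the result with a sessionCount of 0.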

Upvotes: 1
