chetan tamboli
chetan tamboli

Reputation: 181

MongoDB aggregate nested grouping

I have Asset collection which has data like

{
     "_id" : ObjectId("5bfb962ee2a301554915"),
     "users" : [
                 "[email protected]",
                 "[email protected]"
          ],
     "remote" : {
                "source" : "dropbox",
                "bytes" : 1234
}
{
     "_id" : ObjectId("5bfb962ee2a301554915"),
     "users" : [
                 "[email protected]",
          ],
     "remote" : {
                "source" : "google_drive",
                "bytes" : 785
}
{
     "_id" : ObjectId("5bfb962ee2a301554915"),
     "users" : [
                 "[email protected]",
                 "[email protected]"
          ],
     "remote" : {
                "source" : "gmail",
                "bytes" : 5647
}

What I am looking for is group by users and get the total of bytes according to its source like

{
    "_id" : "[email protected]",
    "bytes" : {
                 "google_drive": 1458,
                 "dropbox" : 1254
              }
}

I am not getting how to get the nested output using grouping. I have tried with the query

db.asset.aggregate(
     [
        {$unwind : '$users'},
        {$group:{
                 _id:
                    {'username': "$users", 
                    'source': "$remote.source", 
                    'total': {$sum: "$remote.bytes"}} }
        }
    ]
)

This way I am getting the result with the repeated username.

Upvotes: 2

Views: 3388

Answers (2)

Ashh
Ashh

Reputation: 46491

You have to use $group couple of times here. First with the users and the source and count the total number of bytes using $sum.

And second with the users and $push the source and the bytes into an array

db.collection.aggregate([
  { "$unwind": "$users" },
  { "$group": {
    "_id": {
      "users": "$users",
      "source": "$remote.source"
    },
    "bytes": { "$sum": "$remote.bytes" }
  }},
  { "$group": {
    "_id": "$_id.users",
    "data": {
      "$push": {
        "source": "$_id.source",
        "bytes": "$bytes"
      }
    }
  }}
])

And even if you want to convert the source and the bytes into key value format then replace the last $group stage with the below two stages.

{ "$group": {
  "_id": "$_id.users",
  "data": {
    "$push": {
      "k": "$_id.source",
      "v": "$bytes"
    }
  }
}},
{ "$project": {
  "_id": 0,
  "username": "$_id",
  "bytes": { "$arrayToObject": "$data" }
}}

Upvotes: 2

chridam
chridam

Reputation: 103475

With MongoDb 3.6 and newer, you can leverage the use of $arrayToObject operator within a $mergeObjects expression and a $replaceRoot pipeline to get the desired result.

You would need to run the following aggregate pipeline though:

db.asset.aggregate([
    { "$unwind": "$users" },
    { "$group": {
        "_id": { 
            "users": "$users",
            "source": "$remote.source" 
        },
        "totalBytes": { "$sum": "$remote.bytes" }
    } },
    { "$group": {
        "_id": "$_id.users",
        "counts": {
            "$push": {
                "k": "$_id.source",
                "v": "$totalBytes"
            }
        }
    } },
    { "$replaceRoot": { 
        "newRoot": { 
            "$mergeObjects": [ 
                { "bytes": { "$arrayToObject": "$counts" } }, 
                "$$ROOT" 
            ] 
        } 
    } },
    { "$project": { "counts": 0 } }   
])

which yields

/* 1 */
{
    "bytes" : {
        "gmail" : 5647.0,
        "dropbox" : 1234.0
    },
    "_id" : "[email protected]"
}

/* 2 */
{
    "bytes" : {
        "google_drive" : 785.0
    },
    "_id" : "[email protected]"
}

/* 3 */
{
    "bytes" : {
        "gmail" : 5647.0,
        "dropbox" : 1234.0
    },
    "_id" : "[email protected]"
}

using the above sample documents.

Upvotes: 2

Related Questions