Ryan
Ryan

Reputation: 2183

Aggregate totals for Multiple Keys

Given below the following data structure of a document:

{
    "_id" : x,
    "device_model_id": x,
    "device_os_id": x,
    "device_status_id" : x,
}

I would like to group the data by device_status and device_os:

{u'ok': 1.0, u'result': [
    {
        u'_id': {u'device_os_id': 2, u'device_status_id': 2},
        u'total': $sum, 
        u'models': {
            'device_model_id': 1,
            'total_device_model': $sum
        } 
    },
    {
        u'_id': {u'device_os_id': 1, u'device_status_id': 1},
        u'total': $sum, 
        u'models': {
            'device_model_id': 4,
            'total_device_model': $sum
        } 
    },
    .....    

So far I have the following code:

testing = self.collection.aggregate([
    {'$group': {
    '_id': {'device_os_id': "$device_os_id", 'status': '$device_status_id'},
            'total': {'$sum': 1},
            'models': {'$addToSet': "$device_model_id"},
        }}
    ])

which gives me the following result:

{u'ok': 1.0, u'result': [
{u'total': 20355, u'models': [18, 2, 3], u'_id': {u'device_os_id': 2, u'status': 2}}, 
{u'total': 38429, u'models': [18, 19, 2, 3], u'_id': {u'device_os_id': 2, u'status': 1}}, 
{u'total': 1, u'models': [15], u'_id': {u'device_os_id': 1.0, u'status': 1}}]}

I haven't found a way to add the sum of each device_model_id to the 'models' list. Does anyone know how I can acheive this? Many thanks for your help.

Upvotes: 2

Views: 60

Answers (1)

Blakes Seven
Blakes Seven

Reputation: 50426

You do this in two $group pipeline stages, first by all keys including "model" and then just for the lower level of detail:

self.collection.aggregate([
    { '$group': {
        '_id': {
            'device_os_id': "$device_os_id", 
            'status': '$device_status_id',
            'model': '$device_model_id'
        },
        'total': {'$sum': 1}
    }},
    { '$group': {
        '_id': {
            'device_os_id': '$_id.device_os_id', 
            'status': '$_id.status'},
        },
        'models': { '$push': { 'model': '$_id.model', 'total': '$total' } },
        'total': { '$sum': '$total' }
    }}
])

As a "pipeline" the first stage condenses at one level and gets the detail totals. All the "second" stage is doing is folding the content which is already now unique for "model" down into an array of "models" with the other grouping items as a key.

Upvotes: 2

Related Questions