Benjamin Harel

Reputation: 2946

Map reduce in MongoDB

I have MongoDB documents in this format:

{"_id" : 1,"Summary" : {...},"Examples" : [{"_id" : 353,"CategoryId" : 4},{"_id" : 239,"CategoryId" : 28}, ...  ]}
{"_id" : 2,"Summary" : {...},"Examples" : [{"_id" : 312,"CategoryId" : 2},{"_id" : 121,"CategoryId" : 12}, ...  ]}

How can I map/reduce them to get a hash like:

{ [ result[categoryId] : count_of_examples , .....] }

That is, a count of examples for each category. I have 30 categories in total, all listed in the Categories collection.

Upvotes: 1

Views: 392

Answers (1)

Asya Kamsky

Reputation: 42342

If you can use 2.1 (the development version of the upcoming 2.2 release), you can use the Aggregation Framework, and it would look something like this:

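db.collection.aggregate( [
       // Pull Examples.CategoryId up into a top-level array field (optional, for readability)
       {$project:{"CatId":"$Examples.CategoryId","_id":0}},
       // Emit one document per element of the CatId array
       {$unwind:"$CatId"},
       // Group by category and count the documents in each group
       {$group:{_id:"$CatId","num":{$sum:1} } },
       // Relabel the output fields
       {$project:{CategoryId:"$_id",NumberOfExamples:"$num",_id:0  }}
] );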

The first step projects the CategoryId subfield of Examples into a top-level field of the document (not necessary, but it helps with readability). Then we unwind the resulting array, which creates a separate document for each value of CatId. Next we do a "group by" and count them (I assume each instance of CategoryId is one example, right?). Last, we use projection again to relabel the fields so that the result looks like this:

"result" : [
    {
        "CategoryId" : 12,
        "NumberOfExamples" : 1
    },
    {
        "CategoryId" : 2,
        "NumberOfExamples" : 1
    },
    {
        "CategoryId" : 28,
        "NumberOfExamples" : 1
    },
    {
        "CategoryId" : 4,
        "NumberOfExamples" : 1
    }
],
"ok" : 1
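
Since the question asked about map/reduce specifically, here is a rough sketch of how the same count could be written as a map function and a reduce function. This is only an illustration under some assumptions: the names mapFn and reduceFn are made up, the sample data is just the two documents from the question (with Summary left out), and the emit function below is a local stand-in so the snippet can run outside the server. Inside MongoDB, emit is provided for you.

```javascript
// Sample documents shaped like the ones in the question (Summary omitted).
const docs = [
  { _id: 1, Examples: [{ _id: 353, CategoryId: 4 }, { _id: 239, CategoryId: 28 }] },
  { _id: 2, Examples: [{ _id: 312, CategoryId: 2 }, { _id: 121, CategoryId: 12 }] }
];

// Map: emit one (CategoryId, 1) pair per embedded example.
// "this" is the document currently being processed.
function mapFn() {
  (this.Examples || []).forEach(function (ex) {
    emit(ex.CategoryId, 1);
  });
}

// Reduce: sum the partial counts collected for one key.
function reduceFn(key, values) {
  return values.reduce(function (a, b) { return a + b; }, 0);
}

// Local stand-in for the server (assumption, for illustration only):
// collect emitted values per key, then reduce each key once.
const emitted = {};
function emit(key, value) {
  (emitted[key] = emitted[key] || []).push(value);
}
docs.forEach(function (doc) { mapFn.call(doc); });

const counts = {};
Object.keys(emitted).forEach(function (key) {
  counts[key] = reduceFn(key, emitted[key]);
});

// One entry per category that has at least one example.
console.log(counts);
```

Against a real server, the two functions would presumably be passed as db.collection.mapReduce(mapFn, reduceFn, { out: { inline: 1 } }) rather than run through the local harness. Either way, a category with no examples never gets emitted, so it will not appear in the output; if all 30 categories are needed, the zero counts would have to be filled in from the Categories collection afterwards.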

Upvotes: 1
