Jayyrus

Reputation: 13051

How to do this same analysis with MapReduce and Aggregate

I need to perform this analysis using mapReduce and/or aggregate:

DBCollection coll = db.getCollection("documents");
DBCursor cursor = coll.find();
Map<String,Integer> map = new HashMap<String,Integer>();
while(cursor.hasNext()){
    DBObject obj = cursor.next();
    BasicDBList list = (BasicDBList)obj.get("cats");
    for(int i=0;i<list.size();i++){
        String cat = list.get(i).toString();
        int hits   = 0;
        if(map.containsKey(cat)){
            hits = map.get(cat);
        }
        hits++;
        map.put(cat, hits);
    }
}

Can someone give me a correct example of how to use mapReduce and aggregate to achieve what I need?

Thanks!

Upvotes: 0

Views: 35

Answers (1)

Neil Lunn

Reputation: 151112

You seem to be counting how many times each distinct element occurs across the arrays. The content of the elements does not matter, since you are just converting each one to a string key in your map. Here's a sample:

{ "cats" : [ 1, 2, 3, 4, 5 ] }
{ "cats" : [ 2, 4 ] }
{ "cats" : [ 1, 5 ] }
{ "cats" : [ 4, 5 ] }

The aggregation framework is the fastest:

db.cats.aggregate([
    { "$unwind": "$cats" },
    { "$group": {
        "_id": "$cats",
        "count": { "$sum": 1 }
    }}
])

Which produces:

{ "_id" : 5, "count" : 3 }
{ "_id" : 4, "count" : 3 }
{ "_id" : 3, "count" : 1 }
{ "_id" : 2, "count" : 2 }
{ "_id" : 1, "count" : 2 }
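To make the two stages concrete, here is a minimal plain-JavaScript sketch of what "$unwind" followed by "$group" computes on the sample documents. It does not talk to MongoDB at all; the helper name countCats and the variable names are made up for illustration, and the result order here is simply insertion order, whereas the pipeline does not guarantee any order without a "$sort" stage.

```javascript
// Sample documents copied from the answer above.
const docs = [
    { cats: [1, 2, 3, 4, 5] },
    { cats: [2, 4] },
    { cats: [1, 5] },
    { cats: [4, 5] }
];

// Hypothetical helper (not a MongoDB API): emulates
// { "$unwind": "$cats" } followed by
// { "$group": { "_id": "$cats", "count": { "$sum": 1 } } }.
function countCats(documents) {
    const counts = new Map();
    for (const doc of documents) {
        // "$unwind": treat every array element as its own document
        for (const cat of doc.cats) {
            // "$group" with "$sum": 1 adds one per element seen
            counts.set(cat, (counts.get(cat) || 0) + 1);
        }
    }
    // Shape the result like the pipeline output
    return Array.from(counts, ([id, count]) => ({ _id: id, count: count }));
}

const grouped = countCats(docs);
console.log(grouped);
```

The counts match the pipeline output shown above (1 and 2 appear twice, 3 once, 4 and 5 three times), which is also what the Java loop in the question accumulates in its HashMap.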

Map reduce is much the same but slower:

db.cats.mapReduce(
    function() {
        this.cats.forEach(function(cat) {
            emit( cat, 1 );
        });
    },
    function(key,values) {
        return Array.sum( values );
    },
    { "out": { "inline": 1 } }
)
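For anyone unsure how the map and reduce functions fit together, the following is a rough plain-JavaScript emulation of the call above, run against the same sample documents. It is only a sketch: emit, emittedByKey, mapCats and sumValues are stand-ins defined here, not the server's implementation, and Array.sum is replaced with a standard reduce because it is a mongo shell helper rather than part of standard JavaScript.

```javascript
// Sample documents copied from the answer above.
const sampleDocs = [
    { cats: [1, 2, 3, 4, 5] },
    { cats: [2, 4] },
    { cats: [1, 5] },
    { cats: [4, 5] }
];

// Stand-in for the server-side emit(): collects emitted values per key.
const emittedByKey = new Map();
function emit(key, value) {
    if (!emittedByKey.has(key)) {
        emittedByKey.set(key, []);
    }
    emittedByKey.get(key).push(value);
}

// Same logic as the map function above: one (cat, 1) pair per array element.
function mapCats() {
    this.cats.forEach(function (cat) {
        emit(cat, 1);
    });
}

// Same logic as the reduce function above, without the shell's Array.sum.
function sumValues(key, values) {
    return values.reduce(function (total, v) { return total + v; }, 0);
}

// Map phase: call the map function with each document bound to `this`.
sampleDocs.forEach(function (doc) { mapCats.call(doc); });

// Reduce phase: a key with a single emitted value is passed through
// without calling reduce, which mirrors MongoDB's documented behaviour.
const results = [];
for (const [key, values] of emittedByKey) {
    const value = values.length === 1 ? values[0] : sumValues(key, values);
    results.push({ _id: key, value: value });
}
console.log(results);
```

Note that map-reduce results carry the total in a "value" field rather than "count", so code reading the output has to look at a different field name than with the aggregation pipeline.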

Upvotes: 2
