Reputation: 13051
I need to perform this analysis using mapReduce and/or aggregate:
DBCollection coll = db.getCollection("documents");
DBCursor cursor = coll.find();
Map<String,Integer> map = new HashMap<String,Integer>();
while (cursor.hasNext()) {
    DBObject obj = cursor.next();
    BasicDBList list = (BasicDBList) obj.get("cats");
    for (int i = 0; i < list.size(); i++) {
        String cat = list.get(i).toString();
        int hits = 0;
        if (map.containsKey(cat)) {
            hits = map.get(cat);
        }
        hits++;
        map.put(cat, hits);
    }
}
Can someone give me a correct example of how to use mapReduce and aggregate to achieve what I need?
Thanks!
Upvotes: 0
Views: 35
Reputation: 151112
You seem to be counting how many times each distinct element occurs across the "cats" arrays. The element type does not matter, since you are just converting each element to a string key in your map. Here is a sample:
{ "cats" : [ 1, 2, 3, 4, 5 ] }
{ "cats" : [ 2, 4 ] }
{ "cats" : [ 1, 5 ] }
{ "cats" : [ 4, 5 ] }
The aggregation framework is the fastest option, since it runs natively on the server rather than through the JavaScript engine:
db.cats.aggregate([
    { "$unwind": "$cats" },
    { "$group": {
        "_id": "$cats",
        "count": { "$sum": 1 }
    }}
])
Which produces:
{ "_id" : 5, "count" : 3 }
{ "_id" : 4, "count" : 3 }
{ "_id" : 3, "count" : 1 }
{ "_id" : 2, "count" : 2 }
{ "_id" : 1, "count" : 2 }
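To see what the two stages do, here is a plain JavaScript sketch that emulates them on the sample documents without a database. The helper names `unwind` and `groupCount` are made up for illustration and are not part of any MongoDB API; the order of the results is also not meant to match the server's.

```javascript
// Sample documents from above.
const docs = [
  { cats: [1, 2, 3, 4, 5] },
  { cats: [2, 4] },
  { cats: [1, 5] },
  { cats: [4, 5] },
];

// Hypothetical helper emulating { "$unwind": "$cats" }:
// emits one copy of the document per array element.
function unwind(documents, field) {
  return documents.flatMap((doc) =>
    (doc[field] || []).map((value) => ({ ...doc, [field]: value }))
  );
}

// Hypothetical helper emulating
// { "$group": { "_id": "$cats", "count": { "$sum": 1 } } }:
// adds 1 to a running count for every document sharing the same key.
function groupCount(documents, field) {
  const counts = new Map();
  for (const doc of documents) {
    counts.set(doc[field], (counts.get(doc[field]) || 0) + 1);
  }
  return [...counts].map(([_id, count]) => ({ _id, count }));
}

const result = groupCount(unwind(docs, "cats"), "cats");
console.log(result);
```

After the unwind step there are eleven single-value documents, which is why a plain `$sum` of 1 per document is enough to count occurrences.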
Map-reduce works much the same way but is slower:
db.cats.mapReduce(
    function() {
        // Emit one (cat, 1) pair per array element
        this.cats.forEach(function(cat) {
            emit( cat, 1 );
        });
    },
    function(key,values) {
        // Sum the emitted 1s for each key
        return Array.sum( values );
    },
    { "out": { "inline": 1 } }
)
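If the emit/reduce flow is unclear, the following plain JavaScript sketch mimics it on the sample documents. It is only an approximation under stated assumptions: `runMapReduce` is a hypothetical stand-in for the server, `emit` is passed in as an argument rather than being a global as in the shell, and a standard `reduce` call replaces `Array.sum`, which is a mongo shell helper and not part of standard JavaScript.

```javascript
const sampleDocs = [
  { cats: [1, 2, 3, 4, 5] },
  { cats: [2, 4] },
  { cats: [1, 5] },
  { cats: [4, 5] },
];

// Hypothetical stand-in for the server side of map-reduce.
function runMapReduce(documents, mapFn, reduceFn) {
  const emitted = new Map();
  // Collect every emitted value under its key.
  const emit = (key, value) => {
    if (!emitted.has(key)) emitted.set(key, []);
    emitted.get(key).push(value);
  };
  for (const doc of documents) {
    mapFn.call(doc, emit); // "this" is the current document, as in the shell
  }
  // As in MongoDB, reduce is skipped for keys with a single value.
  return [...emitted].map(([key, values]) => ({
    _id: key,
    value: values.length > 1 ? reduceFn(key, values) : values[0],
  }));
}

const counts = runMapReduce(
  sampleDocs,
  function (emit) {
    this.cats.forEach((cat) => emit(cat, 1));
  },
  (key, values) => values.reduce((sum, v) => sum + v, 0)
);
console.log(counts);
```

Note that the result field is named `value` here, mirroring map-reduce output, whereas the aggregation pipeline above names it `count`.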
Upvotes: 2