Reputation: 8671
I have 1000 user records in a collection, of which 459 documents have gender male and the remaining 541 have gender female.
//document structure
> db.user_details.find().pretty()
{
"_id" : ObjectId("557e610d626754910f0974a4"),
"id" : 0,
"name" : "Leanne Flinn",
"email" : "[email protected]",
"work" : "Unilogic",
"dob" : "Fri Jun 11 1965 20:50:58 GMT+0530 (IST)",
"age" : 5,
"gender" : "female",
"salary" : 35696,
"hobbies" : "Acrobatics,Meditation,Music"
}
{
"_id" : ObjectId("557e610d626754910f0974a5"),
"id" : 1,
"name" : "Edward Young",
"email" : "[email protected]",
"work" : "Solexis",
"dob" : "Wed Feb 12 1941 16:45:53 GMT+0530 (IST)",
"age" : 1,
"gender" : "female",
"salary" : 72291,
"hobbies" : "Acrobatics,Meditation,Music"
}
{
"_id" : ObjectId("557e610d626754910f0974a6"),
"id" : 2,
"name" : "Haydee Milligan",
"email" : "[email protected]",
"work" : "Dalserve",
"dob" : "Tue Sep 13 1994 13:45:04 GMT+0530 (IST)",
"age" : 17,
"gender" : "male",
"salary" : 20026,
"hobbies" : "Papier-Mache"
}
{
"_id" : ObjectId("557e610d626754910f0974a7"),
"id" : 3,
"name" : "Lyle Keesee",
"email" : "[email protected]",
"work" : "Terrasys",
"dob" : "Tue Apr 25 1922 13:39:46 GMT+0530 (IST)",
"age" : 79,
"gender" : "female",
"salary" : 48032,
"hobbies" : "Acrobatics,Meditation,Music"
}
{
"_id" : ObjectId("557e610d626754910f0974a8"),
"id" : 4,
"name" : "Shea Mercer",
"email" : "[email protected]",
"work" : "Pancast",
"dob" : "Mon Apr 08 1935 06:10:30 GMT+0530 (IST)",
"age" : 51,
"gender" : "male",
"salary" : 31511,
"hobbies" : "Acrobatics,Photography,Papier-Mache"
}
Number of users in each gender
> db.user_details.find({gender:'male'}).count()
459
>
> db.user_details.find({gender:'female'}).count()
541
> db.user_details.find({name:{$ne:null}}).count()
1000
> db.user_details.find({age:{$ne:null}}).count()
1000
Map reduce code
mapper = function(){
emit(this.gender, {name:this.name,age:this.age})
}
reducer = function(gender, users){
var res = 0;
users.forEach(function(user){
res = res + 1
})
return res;
}
db.user_details.mapReduce(mapper, reducer, {out: {inline:1}})
Why does the map reduce result count only 102 users in total (56 + 46)? Shouldn't it contain 459 and 541 for male and female respectively?
// Map reduce result
{
"results" : [
{
"_id" : "female",
"value" : 56
},
{
"_id" : "male",
"value" : 46
}
],
"timeMillis" : 45,
"counts" : {
"input" : 1000,
"emit" : 1000,
"reduce" : 20,
"output" : 2
},
"ok" : 1
}
Note: I know this is not the proper way to use map reduce. I actually ran into a stranger problem with map reduce, and once I have a solution to this question I should be able to solve that one too.
Upvotes: 0
Views: 374
Reputation: 3485
This is a proper way to use mapReduce() to display a gender-wise count of users:
db.yourCollectionName.mapReduce(
function(){
emit(this.gender,1);
},
function(k,v){
return Array.sum(v);
},
{out:"genderCount"}
);
db.genderCount.find();
Upvotes: 0
Reputation: 2029
There is a mistake in the reduce function.
MongoDB can call the reduce function multiple times for the same key, so your reducer ends up counting its own partial results as if each one were a single document.
Also, the map function emits a document of the form { name, age }, but the reduce function returns a plain count.
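// Assumes the map function emits { user: 1, age: this.age },
// so that map output and reduce output have the same shape.
reduce = function(gender, docs) {
var reducedVal = { user: 0, age: 0 };
for (var idx = 0; idx < docs.length; idx++) {
// Add the partial totals; a value may itself be an earlier reduce result.
reducedVal.user += docs[idx].user;
reducedVal.age += docs[idx].age;
}
return reducedVal;
};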
Please see the link below as well:
http://thejackalofjavascript.com/mapreduce-in-mongodb/
Upvotes: 0
Reputation: 1362
This part is probably wrong, because it counts the values passed to reduce instead of summing them:
users.forEach(function(user){
res = res + 1
})
Try this instead (it assumes the map function emits a number for each document, e.g. emit(this.gender, 1)):
function(gender, users){
return Array.sum( users)
}
Upvotes: 0
Reputation:
Your problem here is that you have missed one of the core concepts of how mapReduce works. The relevant documentation explains it:
- MongoDB can invoke the reduce function more than once for the same key. In this case, the previous output from the reduce function for that key will become one of the input values to the next reduce function invocation for that key.
And then also a bit later:
- the type of the return object must be identical to the type of the value emitted by the map function
What those two statements mean is that the mapper and the reducer must produce values of exactly the same structure, because the reduce process will indeed get called "multiple times" for the same key.
This is how mapReduce deals with large data: it does not necessarily process all of the values for a given "key" at once, but works through them in incremental "chunks".
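To see why the counting reducer from the question goes wrong, the chunked behaviour can be imitated in plain JavaScript. This is only a sketch: `reduceInChunks` is a hypothetical helper (not a MongoDB API), the chunk size of 100 is an arbitrary assumption, and the sample values are made up, so it shows the mechanism rather than reproducing MongoDB's exact numbers.

```javascript
// Reducer from the question: counts how many values it receives.
function countingReducer(key, values) {
  var res = 0;
  values.forEach(function () {
    res = res + 1;
  });
  return res;
}

// Reducer that sums its values; pairs with emit(this.gender, 1).
function summingReducer(key, values) {
  return values.reduce(function (total, v) {
    return total + v;
  }, 0);
}

// Hypothetical helper (not a MongoDB API): reduce the values in chunks, then
// feed the partial results back into the reducer until one value is left.
// chunkSize must be at least 2.
function reduceInChunks(reducer, key, values, chunkSize) {
  var current = values;
  while (current.length > 1) {
    var partials = [];
    for (var i = 0; i < current.length; i += chunkSize) {
      partials.push(reducer(key, current.slice(i, i + chunkSize)));
    }
    current = partials;
  }
  return current[0];
}

// 459 emitted values for the key "male", as in the question.
var emittedDocs = [];
var emittedOnes = [];
for (var n = 0; n < 459; n++) {
  emittedDocs.push({ name: "user" + n, age: 30 }); // made-up sample values
  emittedOnes.push(1);
}

var wrongCount = reduceInChunks(countingReducer, "male", emittedDocs, 100);
var rightCount = reduceInChunks(summingReducer, "male", emittedOnes, 100);
```

Under these assumptions `wrongCount` ends up as 5, because the final pass only counts the five partial results, while `rightCount` is 459, because the summing reducer adds the partial results together.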
Therefore, if all you want in the output is a "number", then all you "emit" should be a "number" as well:
db.collection.mapReduce(
function() {
emit(this.gender, this.age);
},
function(key,values) {
return Array.sum( values )
},
{ "out": { "inline": 1 } }
)
Or just "count" per type:
db.collection.mapReduce(
function() {
emit(this.gender, 1);
},
function(key,values) {
return Array.sum( values )
},
{ "out": { "inline": 1 } }
)
The point is "you need to put out the same as what you put in", as it will likely "go back in again". So whatever data you want to collect, the output structure for both mapper and reducer must be the same.
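As a sketch of that rule with more than one field, the value below carries both a count and an age total, and the reducer returns the same shape it receives. The field names `count` and `ageSum` are illustrative assumptions; the collection name `user_details` and the sample ages come from the question. The mapReduce call is guarded so that it only runs where a `db` object exists, i.e. in the mongo shell.

```javascript
// Mapper: emits a value with the same shape that the reducer returns.
var mapper = function () {
  emit(this.gender, { count: 1, ageSum: this.age });
};

// Reducer: adds up partial results field by field, so its own output
// can safely be passed back in as one of the input values.
var reducer = function (key, values) {
  var out = { count: 0, ageSum: 0 };
  values.forEach(function (v) {
    out.count += v.count;
    out.ageSum += v.ageSum;
  });
  return out;
};

// Quick check outside MongoDB: reducing in one step and reducing in two
// steps (with a partial result fed back in) should give the same totals.
var a = [{ count: 1, ageSum: 5 }, { count: 1, ageSum: 79 }];
var b = [{ count: 1, ageSum: 1 }];
var allAtOnce = reducer("female", a.concat(b));
var inTwoSteps = reducer("female", [reducer("female", a)].concat(b));

// Only runs inside the mongo shell, where db and printjson are defined.
if (typeof db !== "undefined") {
  printjson(db.user_details.mapReduce(mapper, reducer, { out: { inline: 1 } }));
}
```

Both `allAtOnce` and `inTwoSteps` come out as a count of 3 and an age total of 85, which is the property the reducer needs in order to survive repeated invocation.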
Upvotes: 1