vimal1083

Reputation: 8671

Incorrect response to mapReduce query in mongo-db

I have 1000 user records in a collection, of which 459 documents have gender male and the remaining 541 have gender female.

//document structure
> db.user_details.find().pretty()
{
    "_id" : ObjectId("557e610d626754910f0974a4"),
    "id" : 0,
    "name" : "Leanne Flinn",
    "email" : "[email protected]",
    "work" : "Unilogic",
    "dob" : "Fri Jun 11 1965 20:50:58 GMT+0530 (IST)",
    "age" : 5,
    "gender" : "female",
    "salary" : 35696,
    "hobbies" : "Acrobatics,Meditation,Music"
}
{
    "_id" : ObjectId("557e610d626754910f0974a5"),
    "id" : 1,
    "name" : "Edward Young",
    "email" : "[email protected]",
    "work" : "Solexis",
    "dob" : "Wed Feb 12 1941 16:45:53 GMT+0530 (IST)",
    "age" : 1,
    "gender" : "female",
    "salary" : 72291,
    "hobbies" : "Acrobatics,Meditation,Music"
}
{
    "_id" : ObjectId("557e610d626754910f0974a6"),
    "id" : 2,
    "name" : "Haydee Milligan",
    "email" : "[email protected]",
    "work" : "Dalserve",
    "dob" : "Tue Sep 13 1994 13:45:04 GMT+0530 (IST)",
    "age" : 17,
    "gender" : "male",
    "salary" : 20026,
    "hobbies" : "Papier-Mache"
}
{
    "_id" : ObjectId("557e610d626754910f0974a7"),
    "id" : 3,
    "name" : "Lyle Keesee",
    "email" : "[email protected]",
    "work" : "Terrasys",
    "dob" : "Tue Apr 25 1922 13:39:46 GMT+0530 (IST)",
    "age" : 79,
    "gender" : "female",
    "salary" : 48032,
    "hobbies" : "Acrobatics,Meditation,Music"
}
{
    "_id" : ObjectId("557e610d626754910f0974a8"),
    "id" : 4,
    "name" : "Shea Mercer",
    "email" : "[email protected]",
    "work" : "Pancast",
    "dob" : "Mon Apr 08 1935 06:10:30 GMT+0530 (IST)",
    "age" : 51,
    "gender" : "male",
    "salary" : 31511,
    "hobbies" : "Acrobatics,Photography,Papier-Mache"
}

Number of users in each gender

> db.user_details.find({gender:'male'}).count()
459
> 
> db.user_details.find({gender:'female'}).count()
541



> db.user_details.find({name:{$ne:null}}).count()
1000
> db.user_details.find({age:{$ne:null}}).count()
1000

Map reduce code

mapper = function(){
  emit(this.gender, {name:this.name,age:this.age})
}

reducer = function(gender, users){
  var res = 0;
  users.forEach(function(user){
    res = res + 1
  })
  return res;
}


db.user_details.mapReduce(mapper, reducer, {out: {inline:1}})

Why does the map reduce result count only 102 users in total (56 + 46)? It should contain 459 and 541 for male and female respectively, shouldn't it?

// Map reduce result
{
  "results" : [
    {
      "_id" : "female",
      "value" : 56
    },
    {
      "_id" : "male",
      "value" : 46
    }
  ],
  "timeMillis" : 45,
  "counts" : {
    "input" : 1000,
    "emit" : 1000,
    "reduce" : 20,
    "output" : 2
  },
  "ok" : 1
}

Note: I know this is not a proper way to use map reduce. I actually ran into a stranger problem with map reduce, and once I have a solution to this question I should be able to solve that one too.

Upvotes: 0

Views: 374

Answers (4)

dipenparmar12

Reputation: 3485

This is a proper way to use mapReduce() to display a gender-wise count of users:

    db.yourCollectionName.mapReduce(
       function(){
           emit(this.gender,1);
       },
       function(k,v){
          return Array.sum(v);
       },
       {out:"genderCount"}
    );
    db.genderCount.find();

Upvotes: 0

Yathish Manjunath

Reputation: 2029

There is a mistake in the reduce function.

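MongoDB can call the reduce function multiple times for the same key, feeding the result of an earlier call back in as one of the values. Your reduce code counts that earlier result as a single entry, so the running count gets overwritten.

Also, in the map function you are emitting a document of structure { name, age }, but in the reduce function you are returning a count.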

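  // Assumes the map function emits values of the same shape,
  // e.g. emit(this.gender, { user: 1, age: this.age })
  reduce = function(gender, docs) {
                 var reducedVal = { user: 0, age: 0 };

                 // Add up the fields of each value; a value may itself be
                 // the output of an earlier reduce call for this key.
                 for (var idx = 0; idx < docs.length; idx++) {
                     reducedVal.user += docs[idx].user;
                     reducedVal.age += docs[idx].age;
                 }

                 return reducedVal;
              };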

Please check the link below as well:

http://thejackalofjavascript.com/mapreduce-in-mongodb/

Upvotes: 0

Hitesh Vaghani

Reputation: 1362

This part of your reducer is probably wrong:

users.forEach(function(user){
    res = res + 1
  })

Try this, with the map function emitting 1 for each document, i.e. emit(this.gender, 1):

function(gender, users){
   return Array.sum( users)
}

Upvotes: 0

user3561036

Reputation:

Your problem here is that you have missed one of the core concepts of how mapReduce works. The relevant documentation explains it:

  • MongoDB can invoke the reduce function more than once for the same key. In this case, the previous output from the reduce function for that key will become one of the input values to the next reduce function invocation for that key.

And then also a bit later:

  • the type of the return object must be identical to the type of the value emitted by the map function

Together, those two statements mean that the value emitted by the mapper and the value returned by the reducer must have exactly the same structure, because the reducer will indeed get called "multiple times" for the same key.

This is how mapReduce deals with large data: it does not necessarily process all of the values for a given "key" at once, but works through them in incremental "chunks".
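To make the "chunks" behaviour concrete, here is a small sketch in plain JavaScript that can be run outside the mongo shell. It is only a model: `chunkedReduce`, the chunk size of 100, and the generated sample values are assumptions made for illustration, not how the server actually schedules its reduce calls, and `sum` is a local stand-in for the shell's `Array.sum`.

```javascript
// Local stand-in for the mongo shell's Array.sum helper.
function sum(values) {
  return values.reduce(function (total, v) { return total + v; }, 0);
}

// Hypothetical model of incremental reduction (not MongoDB's real scheduler):
// reduce one chunk at a time and feed each result back in as an input value.
function chunkedReduce(reduceFn, key, values, chunkSize) {
  var size = Math.max(2, chunkSize); // a chunk of 1 would never shrink the list
  var pending = values.slice();
  while (pending.length > 1) {
    var chunk = pending.splice(0, size);
    pending.push(reduceFn(key, chunk));
  }
  return pending[0];
}

// Reducer from the question: counts array entries, so an earlier
// partial count that comes back in is counted as just one entry.
function countingReducer(gender, users) {
  var res = 0;
  users.forEach(function () { res = res + 1; });
  return res;
}

// Reducer that pairs with emit(this.gender, 1): sums whatever it is given,
// which works for both fresh 1s and earlier partial sums.
function summingReducer(gender, counts) {
  return sum(counts);
}

// 541 emitted values for the "female" key, in both styles.
var emittedDocs = [];
var emittedOnes = [];
for (var i = 0; i < 541; i++) {
  emittedDocs.push({ name: "user" + i, age: i % 90 }); // made-up sample data
  emittedOnes.push(1);
}

var brokenCount = chunkedReduce(countingReducer, "female", emittedDocs, 100);
var correctCount = chunkedReduce(summingReducer, "female", emittedOnes, 100);
console.log("counting reducer:", brokenCount, "summing reducer:", correctCount);
```

The summing reducer arrives at 541 however the values are chunked, while the counting reducer loses every partial result that is fed back in. The exact wrong number depends on the chunking, which is why the figures in the question look arbitrary.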

Therefore, if all you want in the output is a "number", then all you "emit" should be just a "number" as well:

db.collection.mapReduce(
    function() {
       emit(this.gender, this.age);
    },
    function(key,values) {
        return Array.sum( values )
    },
    { "out": { "inline": 1 } }
)

Or just "count" per type:

db.collection.mapReduce(
    function() {
       emit(this.gender, 1);
    },
    function(key,values) {
        return Array.sum( values )
    },
    { "out": { "inline": 1 } }
)

The point is "you need to put out the same as what you put in", as it will likely "go back in again". So whatever data you want to collect, the output structure for both mapper and reducer must be the same.
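The same rule applies when you want more than a single number. As a sketch only, here is one way to keep an object-shaped value consistent, written in plain JavaScript so it can be tried outside the shell. The names `toEmittedValue` and `reduceStats` and the `{ count, totalAge }` shape are assumptions chosen for illustration; in a real job the mapper would call `emit(this.gender, { count: 1, totalAge: this.age })` and `reduceStats` would be passed as the reduce function.

```javascript
// Map side: the value that would be emitted for one document.
function toEmittedValue(doc) {
  return { count: 1, totalAge: doc.age };
}

// Reduce side: returns exactly the same shape it receives, so its own
// output can safely come back in as one of the values.
function reduceStats(gender, values) {
  var out = { count: 0, totalAge: 0 };
  values.forEach(function (v) {
    out.count += v.count;
    out.totalAge += v.totalAge;
  });
  return out;
}

// Ages of the three "female" sample documents shown in the question.
var females = [{ age: 5 }, { age: 1 }, { age: 79 }].map(toEmittedValue);

// Reducing everything at once...
var oneShot = reduceStats("female", females);

// ...gives the same answer as reducing a partial result again later.
var partial = reduceStats("female", females.slice(0, 2));
var reReduced = reduceStats("female", [partial, females[2]]);

console.log(oneShot, reReduced);
```

Because both calls produce the same totals, it does not matter how many times or in what grouping the reducer is invoked for a key.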

Upvotes: 1
