bikas

Reputation: 81

Group data by key in MongoDB MapReduce

I am writing a MapReduce program in MongoDB to find mutual friends. I have the following data, obtained after sorting by key in MongoDB:

{"user" : " Hari","friend" : "Shiva",
 "friendList": ["Hanks"," Tom"," Karma"," Hari"," Dinesh"]}


 {"user" : "Hari","friend" : " Shiva",
  "friendList" : ["Karma"," Tom"," Ram"," Bindu"," Shiva",
                   " Kishna"," Bikash"," Bakshi"," Dinesh"]}

Now I want to combine the records that have the same key into a single group, using JavaScript in the map function before the key-value pairs are sent to the reducers. How can I group the data? For example, I want output like:

{"user" : " Hari","friend" : "Shiva",
 "friendList": ["Hanks"," Tom"," Karma"," Hari"," Dinesh"],["Karma"," Tom"," Ram"," Bindu"," Shiva"," Kishna"," Bikash"," Bakshi"," Dinesh"]}

Upvotes: 2

Views: 126

Answers (3)

Murugan Perumal

Reputation: 985

You can do this with a simple aggregation that uses $group on the user and friend fields.

db.collection.aggregate([
    // group on the (user, friend) pair and collect every friendList
    {$group: {
        _id: {
            user: '$user',
            friend: '$friend'
        },
        friendList: {$push: '$friendList'}
    }},

    // project the fields as you wish
    {$project: {
        user: '$_id.user',
        friend: '$_id.friend',
        friendList: '$friendList'
    }}
])

Hopefully this aggregation pipeline returns the result you expect.
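To make the effect of $group with $push easier to see, here is a rough plain-JavaScript model of it that runs without MongoDB. It is only a sketch: groupAndPush is a hypothetical helper, not a MongoDB API. It also trims the user and friend strings, which is my own assumption, because the sample records contain stray leading spaces (" Hari" vs "Hari") that would otherwise put them in different groups.

```javascript
// Plain-JavaScript model of {$group: {..., friendList: {$push: '$friendList'}}}.
const docs = [
  { user: " Hari", friend: "Shiva",
    friendList: ["Hanks", " Tom", " Karma", " Hari", " Dinesh"] },
  { user: "Hari", friend: " Shiva",
    friendList: ["Karma", " Tom", " Ram", " Bindu", " Shiva",
                 " Kishna", " Bikash", " Bakshi", " Dinesh"] },
];

// Hypothetical helper: groups documents by the trimmed (user, friend) pair.
function groupAndPush(documents) {
  const groups = new Map();
  for (const doc of documents) {
    // Assumption: whitespace differences in the keys are accidental.
    const user = doc.user.trim();
    const friend = doc.friend.trim();
    const key = JSON.stringify([user, friend]);
    if (!groups.has(key)) {
      groups.set(key, { user, friend, friendList: [] });
    }
    // Like $push, each friendList is appended as one element,
    // so the result is an array of arrays.
    groups.get(key).friendList.push(doc.friendList);
  }
  return Array.from(groups.values());
}

const grouped = groupAndPush(docs);
console.log(JSON.stringify(grouped, null, 2));
```

Because $push appends each friendList as a whole, the grouped friendList is an array of arrays. If a single flat list is wanted, it would need to be flattened in a later step.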

Upvotes: 1

Aditya

Reputation: 2415

Friend, why take the pain of grouping the values for the same key yourself, when MapReduce already groups the values by key and hands them to the reducer as (key, list of values)?

I strongly recommend that you do the grouping in your reducer instead of in the map function. The main reason is that the map task reads the input record by record and emits each one, so the framework takes on the burden of identifying which records share a key. We only have to decide, in the reduce logic, how the grouped values are laid out in the output.

You can then take the output of the reducer for further processing.

Input:

{"_id" : {"user" : " Hari","friend" : "Shiva"},
 "value" : {"friendList": ["Hanks"," Tom"," Karma"," Hari"," Dinesh"]}}


 {"_id" : {"user" : "Hari","friend" : " Shiva"},
  "value" : {"friendList" : ["Karma"," Tom"," Ram"," Bindu"," Shiva",
                             " Kishna"," Bikash"," Bakshi"," Dinesh"]}}

MapReduce code:

var mapper = function () {
    // Group by the (user, friend) pair
    var key = {"user" : this.user, "friend" : this.friend};
    // Emit a value with the same shape that the reducer returns
    emit(key, {"friendList" : this.friendList});
};

var reducer = function (key, values) {
    var combinedfriendList = {"friendList" : []};

    // Append every emitted friendList for this key to one array
    for (var i = 0; i < values.length; i++) {
        var inter = values[i];
        for (var j = 0; j < inter.friendList.length; j++) {
            combinedfriendList.friendList.push(inter.friendList[j]);
        }
    }
    // mapReduce adds the _id itself, so return only the value
    return combinedfriendList;
};

Expected output:

{"_id" : {"user" : " Hari","friend" : "Shiva"},
 "value" : {"friendList": ["Hanks"," Tom"," Karma"," Hari"," Dinesh","Karma"," Tom"," Ram"," Bindu"," Shiva"," Kishna"," Bikash"," Bakshi"," Dinesh"]}}

Hope this is of some help. You can test it in your environment (alter it if needed) and share your feedback.
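If you want to try the idea without a MongoDB server, a small local stand-in for mapReduce can be sketched as below. Note that runMapReduce and the module-level emit are hypothetical helpers written for this example, not MongoDB APIs, and trimming the key strings is my assumption to make the two sample records share a key. The sketch keeps two rules that MongoDB's mapReduce relies on: the reducer must return a value of the same shape as the values the mapper emits (because it may be re-run on its own output), and the reducer is not called for a key that has only one value.

```javascript
// Local stand-in for MongoDB's mapReduce (hypothetical helpers, for experiments only).
let emitted = [];
function emit(key, value) {
  emitted.push({ key, value });
}

const mapper = function () {
  // Assumption: trim the key parts so " Hari" and "Hari" land in the same group.
  emit({ user: this.user.trim(), friend: this.friend.trim() },
       { friendList: this.friendList });
};

const reducer = function (key, values) {
  // Return the same shape the mapper emits: { friendList: [...] }.
  const combined = { friendList: [] };
  for (const v of values) {
    combined.friendList.push(...v.friendList);
  }
  return combined;
};

function runMapReduce(documents, map, reduce) {
  emitted = [];
  // Call map with `this` bound to each document, as MongoDB does.
  documents.forEach((doc) => map.call(doc));

  // Shuffle step: collect the emitted values per key.
  const buckets = new Map();
  for (const { key, value } of emitted) {
    const id = JSON.stringify(key);
    if (!buckets.has(id)) buckets.set(id, { key, values: [] });
    buckets.get(id).values.push(value);
  }

  // Reduce step: skipped for keys with a single value.
  return Array.from(buckets.values()).map(({ key, values }) => ({
    _id: key,
    value: values.length === 1 ? values[0] : reduce(key, values),
  }));
}

const sampleDocs = [
  { user: " Hari", friend: "Shiva",
    friendList: ["Hanks", " Tom", " Karma", " Hari", " Dinesh"] },
  { user: "Hari", friend: " Shiva",
    friendList: ["Karma", " Tom", " Ram", " Bindu", " Shiva",
                 " Kishna", " Bikash", " Bakshi", " Dinesh"] },
];

const results = runMapReduce(sampleDocs, mapper, reducer);
console.log(JSON.stringify(results, null, 2));
```

Returning only the combined value (and not an object with its own _id) is what keeps the reducer safe to re-run on partial results.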

Upvotes: 0

Dhananjay

Reputation: 554

You can concatenate the friendList arrays of the two records into a single array to create an object like this:

{
  "_id": {
    "user": " Hari",
    "friend": "Shiva"
  },
  "value": {
    "friendList": [
      "Hanks",
      " Tom",
      " Karma",
      " Hari",
      " Dinesh",
      "Karma",
      " Tom",
      " Ram",
      " Bindu",
      " Shiva",
      " Kishna",
      " Bikash",
      " Bakshi",
      " Dinesh"
    ]
  }
}

See the code at https://jsfiddle.net/b6hxswvk/1/ for one way to create this single object.
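In case the fiddle is not reachable, the concatenation can be sketched roughly as follows. This is my own minimal example and not necessarily what the linked fiddle does; the variable names first, second and merged are made up for illustration.

```javascript
// Two records that share a key, in the _id/value layout used above.
const first = {
  _id: { user: " Hari", friend: "Shiva" },
  value: { friendList: ["Hanks", " Tom", " Karma", " Hari", " Dinesh"] },
};
const second = {
  _id: { user: "Hari", friend: " Shiva" },
  value: { friendList: ["Karma", " Tom", " Ram", " Bindu", " Shiva",
                        " Kishna", " Bikash", " Bakshi", " Dinesh"] },
};

// concat returns a new flat array and leaves both inputs untouched.
const merged = {
  _id: first._id,
  value: {
    friendList: first.value.friendList.concat(second.value.friendList),
  },
};

console.log(JSON.stringify(merged, null, 2));
```

Duplicates such as " Tom" and " Dinesh" are kept, which matches the object shown above.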

If you want friendList to be a two-dimensional array, i.e. like this:

{
  "_id": {
    "user": " Hari",
    "friend": "Shiva"
  },
  "value": {
    "friendList": [
      [
        "Hanks",
        " Tom",
        " Karma",
        " Hari",
        " Dinesh"
      ],
      [
        "Karma",
        " Tom",
        " Ram",
        " Bindu",
        " Shiva",
        " Kishna",
        " Bikash",
        " Bakshi",
        " Dinesh"
      ]
    ]
  }
}

you can use the code at https://jsfiddle.net/b6hxswvk/2/.
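A rough sketch of the two-dimensional variant, again my own example rather than a copy of the fiddle (records and nested are illustrative names): each record's friendList is kept as one inner array.

```javascript
// Records that share a key, in the _id/value layout used above.
const records = [
  { _id: { user: " Hari", friend: "Shiva" },
    value: { friendList: ["Hanks", " Tom", " Karma", " Hari", " Dinesh"] } },
  { _id: { user: "Hari", friend: " Shiva" },
    value: { friendList: ["Karma", " Tom", " Ram", " Bindu", " Shiva",
                          " Kishna", " Bikash", " Bakshi", " Dinesh"] } },
];

// map keeps every friendList as its own inner array (array of arrays).
const nested = {
  _id: records[0]._id,
  value: { friendList: records.map((record) => record.value.friendList) },
};

console.log(JSON.stringify(nested, null, 2));
```

This keeps track of which friends came from which record, at the cost of needing a flatten step later if one combined list is required.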

Upvotes: 1
