Reputation: 11
I am writing a small Hadoop program in Java. My requirement is to emit twice from a single map method and handle both emits in a single reduce method. Is this possible? If so, how do I differentiate between the two emits so that I can handle each of them differently in my reduce method? I have searched a lot but couldn't find anything concrete. I am not allowed to use any external libraries.
Upvotes: 1
Views: 3245
Reputation: 3868
If you want to differentiate between the emits from the map at the reducer side, you can either
1) keep the same key for all emits and tag the values, or
2) tag the keys for the different emits as well as the values (this is useful if you want to group or order by some part of the key at the reducer side). For this approach,
read the following:
What is the use of grouping comparator in hadoop map reduce
http://www.bigdataspeak.com/2013/02/hadoop-how-to-do-secondary-sort-on_25.html
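Option 1 can be sketched roughly as follows. This is a plain-Java simulation rather than a real job: the `shuffle` map stands in for what the framework does between map and reduce, and the input layout (`key,name,amount`), the tags `N:` and `A:`, and the class and method names are all made up for illustration. In an actual job the `emit` calls would be `context.write(...)` calls inside your Mapper.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class TaggedEmits {
    // Stand-in for Mapper.map(): two emits under the same key, each value tagged.
    // Hypothetical input layout: "key,name,amount".
    static void map(String line, Map<String, List<String>> shuffle) {
        String[] parts = line.split(",");
        String key = parts[0];
        emit(shuffle, key, "N:" + parts[1]); // first emit, tagged as a name
        emit(shuffle, key, "A:" + parts[2]); // second emit, tagged as an amount
    }

    // Stand-in for context.write(): groups values by key like the shuffle does.
    static void emit(Map<String, List<String>> shuffle, String key, String value) {
        shuffle.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    // Stand-in for Reducer.reduce(): branches on the tag to treat each emit differently.
    static String reduce(String key, Iterable<String> values) {
        List<String> names = new ArrayList<>();
        long total = 0;
        for (String v : values) {
            String payload = v.substring(2); // strip the two-character tag
            if (v.startsWith("N:")) {
                names.add(payload);
            } else if (v.startsWith("A:")) {
                total += Long.parseLong(payload);
            }
        }
        return key + "\t" + names + "\t" + total;
    }
}
```

The tag only has to be something the reducer can test cheaply; a one-character prefix on a Text value is usually enough for two kinds of emit.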
Upvotes: 0
Reputation: 10642
You can output as few or as many records as you need from a single "Map" call.
When you need several of those records handled by a single "Reduce" call, you simply make sure they have the same key, and the Hadoop framework will make sure they are fed into the same reducer call.
Please note that the reducer may receive the key-value pairs in a different order from the one in which you emitted them.
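Because the arrival order is not guaranteed, the reduce logic should not assume that one kind of emit comes before the other. A minimal sketch of an order-insensitive reducer, in plain Java with hypothetical tags `H:` (one header per key) and `D:` (any number of details); the class name and tags are assumptions for illustration only:

```java
import java.util.ArrayList;
import java.util.List;

class OrderInsensitiveReduce {
    // Stand-in for Reducer.reduce(): values for one key may arrive in any order,
    // so both kinds are collected first and only combined after the loop.
    static List<String> reduce(String key, Iterable<String> values) {
        String header = null;
        List<String> details = new ArrayList<>();
        for (String v : values) {
            if (v.startsWith("H:")) {
                header = v.substring(2);
            } else if (v.startsWith("D:")) {
                details.add(v.substring(2)); // buffered until the header is known
            }
        }
        List<String> out = new ArrayList<>();
        for (String d : details) {
            out.add(key + "\t" + header + "\t" + d);
        }
        return out;
    }
}
```

Buffering like this keeps all detail values for a key in memory, so it only suits keys with a modest number of values; for large groups, a secondary sort on a tagged key is the usual alternative.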
Upvotes: 1
Reputation: 33495
A map/reduce task takes key/value pairs as input. The value need not be a string, as it is in most examples such as WordCount; it can also be a complex structure.
You can have a structure with two fields corresponding to the two emits, and that key/value pair will automatically be sent to one reducer.
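A rough sketch of such a two-field value follows. The class name `PairValue` and its fields are hypothetical. The `write` and `readFields` methods use the same signatures as Hadoop's `org.apache.hadoop.io.Writable` interface, but the class deliberately does not declare `implements Writable` here so that the sketch compiles without Hadoop on the classpath; in a real job you would add that clause. The `roundTrip` helper is only there to show the serialization working.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class PairValue {
    String name = "";   // payload of the first emit
    long amount;        // payload of the second emit

    PairValue() {}      // no-argument constructor, needed when the framework creates instances

    PairValue(String name, long amount) {
        this.name = name;
        this.amount = amount;
    }

    // Same signature as Writable.write: serialize both fields in a fixed order.
    public void write(DataOutput out) throws IOException {
        out.writeUTF(name);
        out.writeLong(amount);
    }

    // Same signature as Writable.readFields: read the fields back in the same order.
    public void readFields(DataInput in) throws IOException {
        name = in.readUTF();
        amount = in.readLong();
    }

    // Illustration only: serialize to bytes and back, roughly what happens between map and reduce.
    static PairValue roundTrip(PairValue v) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            v.write(new DataOutputStream(bytes));
            PairValue copy = new PairValue();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this approach the map method emits once per record with both fields filled in, so the reducer reads `name` and `amount` from the same value and no tagging is needed.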
Upvotes: 1