Reputation: 161
I have a use case wherein certain keys that the map phase generates need to be filtered out before the reduce phase kicks in. Is something like this doable? Please let me know.
Upvotes: 1
Views: 906
Reputation: 19867
A couple of options that come to mind:
Using a combiner is not a good choice for this task because, as @100gods mentions, combiner execution is not guaranteed.
Upvotes: 1
Reputation: 1353
Modifying the Mapper class to filter the keys is the more reliable approach, because the execution of a combiner is not guaranteed: Hadoop may or may not run it, and, if required, it may run it more than once. Therefore your MapReduce jobs should not depend on the combiner's execution.
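
A rough sketch of the idea, written in plain Java with no Hadoop dependency so it can be run on its own. The word-count style tokenization, the `BLOCKED_KEYS` set, and the class and method names are all assumptions made for illustration; in a real job the same check would sit inside your `Mapper.map()` just before the call to `context.write()`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.BiConsumer;

// Plain-Java stand-in for a Hadoop mapper; no Hadoop classes are used here.
final class MapSideFilterSketch {

    // Hypothetical set of keys that must never reach the reducer.
    static final Set<String> BLOCKED_KEYS = Set.of("the", "a", "an");

    // Mimics Mapper.map(): tokenizes one input line and emits (word, 1) pairs.
    // The emit callback plays the role of context.write().
    static void map(String line, BiConsumer<String, Integer> emit) {
        for (String token : line.toLowerCase().split("\\s+")) {
            if (token.isEmpty() || BLOCKED_KEYS.contains(token)) {
                continue; // dropped here, so the pair is never emitted or shuffled
            }
            emit.accept(token, 1);
        }
    }

    // Convenience wrapper that collects the emitted pairs for inspection.
    static List<Map.Entry<String, Integer>> run(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        map(line, (k, v) -> out.add(Map.entry(k, v)));
        return out;
    }

    public static void main(String[] args) {
        // Prints the pairs that survive the map-side filter.
        System.out.println(run("The quick fox and a lazy dog"));
    }
}
```

Because the unwanted keys are never emitted, the result does not depend on whether, or how many times, a combiner runs.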
Upvotes: 1