Reputation: 21
I am implementing a reduce-side join in Hadoop MapReduce (Java), and for that purpose I am using multiple inputs. For example, there are two files, Customers and Orders, and I join them on cid (customer_id).
My Questions :
Upvotes: 0
Views: 396
Reputation: 319
A combiner aggregates mapper output locally, before it is sent to the reducers, and you can implement it with whatever logic suits your job. It is often called a mini-reducer, and in Hadoop the combiner class extends Reducer.
Remember that the combiner is not guaranteed to run in all cases, so your mapper output must always be valid reducer input on its own.
I don't quite get your question, though: whatever your mapper input is, the mapper output will be key-value data. In the simplest case the combiner just adds up the values for each key. Say your mapper output is:
{'ali':2, 'jack':4, 'ali':3}
After combining, the output will be:
{'ali':5, 'jack':4}
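To make that concrete, here is a minimal sketch in plain Java of what a summing combiner does to the output of a single mapper. It deliberately has no Hadoop dependency: the class name CombinerSketch and the combine method are made up for illustration, and in a real job the same logic would live in a Reducer subclass registered with job.setCombinerClass(...).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a sum combiner; not a Hadoop class.
public class CombinerSketch {

    // Adds up the values of all pairs that share a key, the way a sum
    // combiner would for the key-value pairs emitted by one mapper.
    static Map<String, Integer> combine(List<Map.Entry<String, Integer>> mapperOutput) {
        Map<String, Integer> combined = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> pair : mapperOutput) {
            // merge() inserts the value if the key is new, otherwise sums it
            // with the value already stored for that key.
            combined.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return combined;
    }

    public static void main(String[] args) {
        // The mapper output from the example above, as a list of pairs
        // (a mapper may emit the same key more than once).
        List<Map.Entry<String, Integer>> mapperOutput = List.of(
                Map.entry("ali", 2),
                Map.entry("jack", 4),
                Map.entry("ali", 3));

        System.out.println(combine(mapperOutput)); // prints {ali=5, jack=4}
    }
}
```

Summing works well here because it gives the same final result whether the combiner runs zero, one, or several times, which is exactly what you need given that it is not guaranteed to run.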
Upvotes: 0