Sandeep Patil

Reputation: 21

How does a combiner work when we use multiple inputs in Hadoop MapReduce?

I am implementing a reduce-side join in Hadoop MapReduce (Java), and for that purpose I am using multiple inputs. For example, there are two files, Customers and Orders, and I join them on cid (customer_id).

My questions:

  1. If I write a combiner class for the above program, how is it going to work? As far as I know, a combiner is a mapper-level aggregator, but in this case we have two mapper logics.
  2. Will the combiner logic be applied to the output of both mappers?
  3. Is there any way to apply the combiner logic to only one of the mappers?

Upvotes: 0

Views: 396

Answers (1)

Aref Khandan

Reputation: 319

A combiner aggregates mapper output, and you can implement it with whatever logic suits your job. It is often called a mini-reducer, and it extends the Reducer class.

Remember that the combiner is not guaranteed to run in all cases, so your mapper output should always be valid as reducer input.
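To illustrate that point, here is a minimal plain-Java sketch (no Hadoop classes involved) of why a summing reducer gives the same answer whether or not a combiner ran. The class name `OptionalCombinerSketch`, the two map tasks, and the sample values are all hypothetical; in a real job the combiner would be a Reducer subclass registered with `Job.setCombinerClass`.

```java
import java.util.List;

final class OptionalCombinerSketch {
    // Reduce step: sums all values received for one key.
    static int reduce(List<Integer> values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        // Values emitted for one key by two hypothetical map tasks.
        List<Integer> task1 = List.of(2, 3);
        List<Integer> task2 = List.of(1);

        // Combiner skipped: the reducer sees every raw value.
        int withoutCombiner = reduce(List.of(2, 3, 1));

        // Combiner ran on each map task: the reducer sees one partial sum per task.
        int withCombiner = reduce(List.of(reduce(task1), reduce(task2)));

        System.out.println(withoutCombiner == withCombiner); // prints true
    }
}
```

This works because addition is associative and commutative. An operation without those properties (a plain average, for example) would not be safe to reuse as a combiner in this way.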

I don't quite follow your question: whatever your mapper input is, the mapper output will be key-value data. The combiner just aggregates the values per key, for example by adding them up. Say your mapper output is:

{'ali':2, 'jack':4, 'ali':3}

After combining, the output will be:

{'ali':5, 'jack':4}
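The same combine step can be sketched in plain Java. This is only a simulation of what a summing combiner does to one map task's output, not Hadoop code; the class name `CombinerSketch` and the `combine` helper are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CombinerSketch {
    // Sums the values for each key, keeping keys in first-seen order.
    static Map<String, Integer> combine(List<Map.Entry<String, Integer>> mapOutput) {
        Map<String, Integer> combined = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> pair : mapOutput) {
            combined.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return combined;
    }

    public static void main(String[] args) {
        // The mapper output from the example above, as a list of key-value pairs.
        List<Map.Entry<String, Integer>> mapOutput = List.of(
                Map.entry("ali", 2),
                Map.entry("jack", 4),
                Map.entry("ali", 3));

        System.out.println(combine(mapOutput)); // prints {ali=5, jack=4}
    }
}
```

Note that the helper only looks at keys and values; it has no knowledge of which mapper class or input file produced a pair.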

Upvotes: 0
