sithmichel

Reputation: 11

Reducers haven't started, yet the MapReduce job shows reduce progress

If reducers do not start before all mappers finish, why does the progress of a MapReduce job show something like Map(50%) Reduce(10%)? Why is a reducer progress percentage displayed when the mappers have not finished yet?

Upvotes: 0

Views: 673

Answers (2)

softinx

Reputation: 11

Reducers start copying intermediate key-value pairs from the mappers as soon as they are available. The progress calculation also takes into account the data transfer performed by the reduce process, so reduce progress starts showing up as soon as any mapper's intermediate key-value pairs are available to be transferred to a reducer. Although the reducer progress is updated, the programmer-defined reduce method is called only after all the mappers have finished.
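To make the arithmetic concrete, here is a minimal sketch of how a reduce-side percentage can be non-zero before reduce() ever runs. It assumes the commonly described model in which a reduce task's progress is split evenly across three phases (copy, sort, reduce); the class and method names are hypothetical and this is not Hadoop's actual code.

```java
// Simplified model of how a reduce task's reported progress is composed.
// Assumption: the copy, sort and reduce phases each contribute one third.
final class ReduceProgressSketch {

    /** Each argument is the completed fraction of that phase, in [0, 1]. */
    static double reduceProgress(double copy, double sort, double reduce) {
        return (copy + sort + reduce) / 3.0;
    }

    public static void main(String[] args) {
        // Mappers still running: 30% of the map outputs have been fetched,
        // sorting has not begun and reduce() has not been called.
        double midJob = reduceProgress(0.30, 0.0, 0.0);
        System.out.printf("Reduce(%.0f%%) while maps are still running%n", midJob * 100);

        // All map outputs fetched and merged; reduce() is about to be called.
        double beforeReduce = reduceProgress(1.0, 1.0, 0.0);
        System.out.printf("Reduce(%.0f%%) just before reduce() starts%n", beforeReduce * 100);
    }
}
```

Under this model, a display such as Map(50%) Reduce(10%) just means the reducer has fetched part of the map output so far; the user-defined reduce method has done no work yet.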

Upvotes: 0

suresiva

Reputation: 3173

It is because of the mapreduce.job.reduce.slowstart.completedmaps property, whose default value is 0.05.

It means that the reducer phase will be started as soon as at least 5% of the total mappers have completed execution.

So the dispatched reducers will stay in the copy phase until all mappers have completed.
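As a rough illustration of the threshold, the sketch below computes how many map tasks must finish before reducers become eligible to be scheduled. The class and method names are hypothetical, and rounding the threshold up to a whole number of map tasks is an assumption of this sketch rather than something stated above.

```java
// Models the slowstart check: reducers are held back until a given
// fraction of the map tasks has completed.
final class SlowstartSketch {

    /** Assumption: the threshold is rounded up to a whole number of maps. */
    static int mapsNeededBeforeReducers(int totalMaps, double slowstart) {
        return (int) Math.ceil(slowstart * totalMaps);
    }

    static boolean reducersMayStart(int completedMaps, int totalMaps, double slowstart) {
        return completedMaps >= mapsNeededBeforeReducers(totalMaps, slowstart);
    }

    public static void main(String[] args) {
        int totalMaps = 100;

        // Default value of the property: reducers may start early and sit in the copy phase.
        System.out.println("slowstart=0.05 -> maps needed: "
                + mapsNeededBeforeReducers(totalMaps, 0.05));

        // With 1.0, every map must finish before any reducer is scheduled.
        System.out.println("slowstart=1.0  -> maps needed: "
                + mapsNeededBeforeReducers(totalMaps, 1.0));
    }
}
```

With the default, the reducers that start early only fetch map output; they cannot move past the copy phase until the last mapper has finished.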

If you want the reducers to start only after all mappers have completed, set the property to 1.0 in the job configuration.
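A minimal sketch of setting this in a driver, assuming the Hadoop client libraries are on the classpath. The class name and job name are hypothetical, and the rest of the job setup is omitted.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class SlowstartConfigExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Hold back all reducers until 100% of the map tasks have completed.
        conf.setFloat("mapreduce.job.reduce.slowstart.completedmaps", 1.0f);

        // "slowstart-example" is a placeholder job name.
        Job job = Job.getInstance(conf, "slowstart-example");

        // Set the mapper, reducer, input and output paths as usual, then submit the job.
    }
}
```

If the driver is run through ToolRunner, the same value can usually be passed on the command line as -Dmapreduce.job.reduce.slowstart.completedmaps=1.0 instead of being hard-coded.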

Upvotes: 2
