lacerated

Reputation: 395

reducer takes mapper cores

I'm running a MapReduce job on a Hadoop cluster with 88 cores and 60 reducers. For some reason it only uses 79 cores of the cluster. It starts with 79 mappers, but once half of the splits are done it runs 53 mappers and 26 reducers, and the number of running mappers continues to shrink afterwards, which increases the job completion time. The log says these 26 reducers are copying the computed data. Is it possible to make Hadoop run all mappers first and only then the reducers? In Spark and Tez jobs, all cores are used for mapping and afterwards all cores are used for reducing.

Upvotes: 2

Views: 106

Answers (1)

gudok

Reputation: 4179

Set mapreduce.job.reduce.slowstart.completedmaps to 1.0. Quote from mapred-default.xml:

mapreduce.job.reduce.slowstart.completedmaps

0.05

Fraction of the number of maps in the job which should be complete before reduces are scheduled for the job.
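For example, a minimal sketch of how this could be set cluster-wide in mapred-site.xml (it can also be passed per-job with `-D mapreduce.job.reduce.slowstart.completedmaps=1.0` on the command line):

```xml
<!-- Delay reducer scheduling until 100% of map tasks have completed,
     so mappers are never starved of cores by early-starting reducers. -->
<property>
  <name>mapreduce.job.reduce.slowstart.completedmaps</name>
  <value>1.0</value>
</property>
```

Note the trade-off: with the default of 0.05, reducers start copying map output early and overlap the shuffle with remaining map work, but they occupy cores while doing so; with 1.0 all cores go to mappers first, at the cost of deferring the entire shuffle until the map phase finishes.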

Upvotes: 5
