Amin Raeiszadeh

Reputation: 208

Is there any way to prevent reduce tasks from starting before all map tasks have completed?

I want to run many jobs at the same time on a Hadoop cluster, but I want to prevent some of those jobs from starting their reduce phase (and so keeping reduce slots busy or reserved) before all of that job's map tasks have completed. Is there a job-level configuration setting that enforces a limit like this?

Thanks.

Upvotes: 1

Views: 761

Answers (2)

Malatesh

Reputation: 1954

You can look up the default values in the Apache Hadoop configuration reference. The property mapred.reduce.slowstart.completed.maps has a default value of 0.05 and is described as:

Fraction of the number of maps in the job which should be complete before reduces are scheduled for the job.
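To make the meaning of that fraction concrete, here is a minimal Java sketch of the rule as the description states it. It is a simplified model, not Hadoop's actual scheduler code: the class and method names are hypothetical, and rounding the threshold up with Math.ceil is an assumption.

```java
// Simplified, hypothetical model of the reduce slow-start rule.
// This is NOT Hadoop's scheduler code; names and rounding are assumptions.
final class ReduceSlowStartSketch {

    // Number of map tasks that must finish before reduces may be scheduled,
    // given the value of mapred.reduce.slowstart.completed.maps.
    static int requiredCompletedMaps(int totalMaps, double slowStartFraction) {
        if (totalMaps < 0 || slowStartFraction < 0.0 || slowStartFraction > 1.0) {
            throw new IllegalArgumentException(
                "totalMaps must be >= 0 and fraction must be in [0, 1]");
        }
        // Assumption: the threshold is rounded up to a whole number of tasks.
        return (int) Math.ceil(slowStartFraction * totalMaps);
    }

    // True once enough map tasks have completed for reduces to be scheduled.
    static boolean canScheduleReduces(int totalMaps, int completedMaps,
                                      double slowStartFraction) {
        return completedMaps >= requiredCompletedMaps(totalMaps, slowStartFraction);
    }

    public static void main(String[] args) {
        int totalMaps = 200; // illustrative job size
        for (double fraction : new double[] {0.05, 0.80, 1.0}) {
            System.out.println("fraction=" + fraction + ": reduces wait for "
                + requiredCompletedMaps(totalMaps, fraction)
                + " of " + totalMaps + " map tasks");
        }
    }
}
```

Under this model, a value of 1.0 means no reduce task is scheduled until every map task of the job has finished, which is the behaviour asked about in the question. If the job's driver accepts generic options, the property can usually be set per job on the command line with -D mapred.reduce.slowstart.completed.maps=1.0 rather than cluster-wide.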

Upvotes: 2

Amin Raeiszadeh

Reputation: 208

Reduce slow start: By default, schedulers wait until 5% of the map tasks in a job have completed before scheduling reduce tasks for the same job. For large jobs this can cause problems with cluster utilization, since they take up reduce slots while waiting for the map tasks to complete. Setting mapred.reduce.slowstart.completed.maps to a higher value, such as 0.80 (80%), can help improve throughput.

Reference: Hadoop: The Definitive Guide, 3rd edition, Chapter 9: Setting Up a Hadoop Cluster, page 316

Upvotes: 4
