Queasy

Reputation: 131

Adding More Than 1 Core Per Container on Hadoop 2.7

I hear there is a way to assign 32 cores, or however many cores you have, to a single container in Hadoop 2.7 YARN.

Is this possible, and does someone have a sample configuration showing what I need to change to achieve it?

The test would be terasort, assigning all 40 of my cores to a single container for the job.

Upvotes: 1

Views: 2155

Answers (2)

Manjunath Ballur

Reputation: 6343

The following configuration properties control vCores:

yarn.scheduler.maximum-allocation-vcores - Specifies the maximum number of vCores that can be allocated for any single container request.

Typically you set this value to 32 in yarn-site.xml. I think a container request for more vCores than this maximum will be rejected by YARN.

  <property>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <value>32</value>
  </property>

If this value is not set, the YARN ResourceManager uses the default value, which is 4:

public static final int DEFAULT_RM_SCHEDULER_MAXIMUM_ALLOCATION_VCORES = 4;
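One related setting is worth checking alongside this limit. As a sketch that goes beyond what this answer covers (so treat it as an assumption about your cluster): a container can only be placed on a node whose NodeManager advertises at least that many vCores through yarn.nodemanager.resource.cpu-vcores in yarn-site.xml. The value 32 below is illustrative and simply mirrors the maximum used above.

```xml
<!-- yarn-site.xml on each NodeManager host (illustrative value) -->
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>32</value>
</property>
```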

If you are running a MapReduce application, then you also need to set two more configuration parameters, in mapred-site.xml:

  • mapreduce.map.cpu.vcores - The number of vCores to request from the scheduler for map tasks
  • mapreduce.reduce.cpu.vcores - The number of vCores to request from the scheduler for the reduce tasks
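As a minimal sketch of those two properties in mapred-site.xml: the value 32 is illustrative, chosen to match the yarn.scheduler.maximum-allocation-vcores example above, and a request above that maximum would not be granted. Whether a task actually benefits depends on it being able to use multiple threads.

```xml
<!-- mapred-site.xml (illustrative values; keep them at or below
     yarn.scheduler.maximum-allocation-vcores) -->
<property>
  <name>mapreduce.map.cpu.vcores</name>
  <value>32</value>
</property>
<property>
  <name>mapreduce.reduce.cpu.vcores</name>
  <value>32</value>
</property>
```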

The resource calculation for your mapper/reducer requests is done in the scheduler code. By default, the Capacity Scheduler's resource calculator considers only memory; if you want the scheduler to take both memory and CPU into account, you need to use "DominantResourceCalculator".

For example, if you are using the Capacity Scheduler, you need to specify the following property in the "capacity-scheduler.xml" file:

  <property>
    <name>yarn.scheduler.capacity.resource-calculator</name>
    <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
  </property>

For a detailed description of the various configuration parameters, please check this link: http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/cdh_ig_yarn_tuning.html

Upvotes: 2

vanekjar

Reputation: 2406

Honestly, I don't know much about Hadoop 2.7, but if the mapper is able to use multiple threads, the number of cores per map (or reduce) container can be set with these properties in the mapred-site.xml file:

mapreduce.map.cpu.vcores - The number of virtual cores to request from the scheduler for each map task.

mapreduce.reduce.cpu.vcores - The number of virtual cores to request from the scheduler for each reduce task.

Please refer to the Hadoop documentation

Upvotes: 1
