Honghe.Wu

Reputation: 6799

How to limit the number of map tasks that run simultaneously on each DataNode

Env:

I configured mapred-site.xml as follows to limit each node to 3 simultaneously running map tasks:

<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
<property>
    <name>mapreduce.tasktracker.map.tasks.maximum</name>
    <value>3</value>
    <description>The maximum number of map tasks that will be run simultaneously by a task tracker.</description>
</property>
<property>
    <name>mapreduce.tasktracker.reduce.tasks.maximum</name>
    <value>3</value>
    <description>The maximum number of reduce tasks that will be run simultaneously by a task tracker.</description>
</property>

But when I run the TestDFSIO benchmark with the following command, up to 8 map tasks run at the same time, so the setting does not seem to take effect:

yarn jar /opt/hadoop-3.0.0/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.0.0-tests.jar \
TestDFSIO -storagePolicy HOT -write \
-nrFiles 500 -fileSize 1000MB -resFile /tmp/DFSIO-write.out

Any help will be appreciated.

Upvotes: 1

Views: 454

Answers (1)

facha

Reputation: 12502

That config parameter belongs to the old Hadoop 1.x TaskTracker, which YARN replaced, so it has no effect here. As far as I can see you are running 3.0.0 on YARN. Try this one instead:

<property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>3</value>
</property>

You should set it in yarn-site.xml on every host that runs a NodeManager.
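
As a rough sketch of where this goes, the property sits inside the `<configuration>` root element of yarn-site.xml. One caveat, stated as an assumption about your setup: with the CapacityScheduler, containers are by default counted by memory only, so the vcore limit may only take effect once the resource calculator is switched to `DominantResourceCalculator` in capacity-scheduler.xml. The second fragment shows that setting; check it against your own scheduler configuration before relying on it.

```xml
<!-- yarn-site.xml (on every NodeManager host): advertise 3 vcores per node -->
<configuration>
    <property>
        <name>yarn.nodemanager.resource.cpu-vcores</name>
        <value>3</value>
    </property>
</configuration>

<!-- capacity-scheduler.xml, a separate file (assumption: the CapacityScheduler is in use):
     make the scheduler account for vcores as well as memory -->
<configuration>
    <property>
        <name>yarn.scheduler.capacity.resource-calculator</name>
        <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
    </property>
</configuration>
```

With each map task requesting the default of 1 vcore (`mapreduce.map.cpu.vcores`), this should cap a node at 3 concurrent containers. The MapReduce ApplicationMaster also occupies a container on whichever node it lands on, so that node will run fewer map tasks. Restart the NodeManagers and the ResourceManager after changing these files.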

Upvotes: 2
