syko

Reputation: 3637

How to separately specify a set of nodes for HDFS and others for MapReduce jobs?

While deploying Hadoop, I want some nodes to run the HDFS daemons but not to run any MapReduce tasks.

For example, there are two nodes A and B that run HDFS.

I want to exclude node A from running any map/reduce tasks, while B runs both.

How can I achieve it? Thanks

Upvotes: 3

Views: 794

Answers (2)

franklinsijo

Reputation: 18270

If you do not want any MapReduce job to run on a particular node or set of nodes, the simplest option is to stop the NodeManager daemon on those nodes if it is already running. Run this command on each node where MR tasks should not be attempted:

yarn-daemon.sh stop nodemanager

Alternatively, exclude the hosts using the property yarn.resourcemanager.nodes.exclude-path in yarn-site.xml:

 <property>
    <name>yarn.resourcemanager.nodes.exclude-path</name>
    <value>/path/to/excludes.txt</value>
    <description>Path of the file containing the hosts to exclude. Should be readable by YARN user</description>
 </property>

After adding this property, refresh the ResourceManager:

yarn rmadmin -refreshNodes

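The nodes listed in the file are excluded from YARN, so no MapReduce tasks (or any other YARN containers) will be scheduled on them. The DataNode on those hosts is not affected and keeps serving HDFS.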
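As a rough sketch of the exclude-file steps, assuming the file holds one host name per line: the host name nodeA.example.com and the default file location below are hypothetical, and EXCLUDES_FILE has to match whatever path you set in yarn.resourcemanager.nodes.exclude-path.

```shell
#!/bin/sh
# Sketch: exclude one host from YARN scheduling.
# EXCLUDES_FILE must match the value of yarn.resourcemanager.nodes.exclude-path
# in yarn-site.xml; the default path and the host name are hypothetical.
EXCLUDES_FILE="${EXCLUDES_FILE:-./excludes.txt}"
EXCLUDED_HOST="nodeA.example.com"

# One host name per line; append only if the host is not already listed.
touch "$EXCLUDES_FILE"
grep -qxF "$EXCLUDED_HOST" "$EXCLUDES_FILE" || echo "$EXCLUDED_HOST" >> "$EXCLUDES_FILE"

# Ask the ResourceManager to re-read the file (skipped if yarn is not on PATH).
if command -v yarn >/dev/null 2>&1; then
    yarn rmadmin -refreshNodes
    # List all nodes with their states to check that the host is no longer RUNNING.
    yarn node -list -all
fi
```

Running the script twice leaves a single entry in the file, so it should be safe to re-run when adding hosts later.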

Upvotes: 4

syko

Reputation: 3637

Answering my own question:

  1. If you use YARN for resource management, see franklinsijo's answer.

  2. If you do not use YARN (classic MapReduce, as in Hadoop 1.x), make a list of the nodes that should run MR tasks and set the path of that file as 'mapred.hosts' in mapred-site.xml. The property is documented in mapred-default (https://hadoop.apache.org/docs/r1.2.1/mapred-default.html).
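For the non-YARN case, the setting might look roughly like this in mapred-site.xml. The path /path/to/mr-includes.txt is a hypothetical placeholder. For the example in the question, the file would list node B and leave out node A.

```xml
<!-- Sketch for mapred-site.xml (Hadoop 1.x); the file path is a placeholder. -->
<property>
  <name>mapred.hosts</name>
  <value>/path/to/mr-includes.txt</value>
  <description>File listing the nodes that may connect to the JobTracker. If the value is empty, all hosts are permitted.</description>
</property>
```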

Upvotes: 0
