Ashwin

Reputation: 607

How to submit a job to specific nodes in Hadoop?

I have a Hadoop cluster with 1 master and 5 slaves. Is there any way of submitting jobs to a specific set of slaves? Basically, what I am trying to do is benchmark my application under many configurations, so after testing with 5 slaves I would like to run it with 4 slaves, then 3 slaves, and so on.

Currently the only way I know of is to decommission a slave and remove it from the Hadoop cluster, but that seems to be a tedious task. I was wondering if there is an easier approach, so as to avoid removing nodes from the cluster entirely.

Thanks.

Upvotes: 2

Views: 2296

Answers (3)

Ashwin

Reputation: 607

For those who encounter this problem, the comments from Alex and the related Stack Overflow question will help in successfully decommissioning a node from a Hadoop cluster.

EDIT: Just editing the files hdfs-site.xml and mapred-site.xml and executing hadoop dfsadmin -refreshNodes might leave your datanode stuck in the decommissioning state for a long time. So it is also necessary to change dfs.replication to an appropriate value.
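
A minimal sketch of what that looks like, assuming a Hadoop 1.x layout; /path/to/excludes and slave5.example.com are placeholders, and the property names dfs.hosts.exclude and dfs.replication should be verified against your version:

    <!-- hdfs-site.xml: point the namenode at an exclude file and lower replication -->
    <property>
      <name>dfs.hosts.exclude</name>
      <value>/path/to/excludes</value>  <!-- placeholder: file with one hostname per line -->
    </property>
    <property>
      <name>dfs.replication</name>
      <value>2</value>  <!-- must not exceed the number of datanodes remaining after decommissioning -->
    </property>

    # list the node to retire, then tell the namenode to re-read the exclude file
    echo "slave5.example.com" >> /path/to/excludes
    hadoop dfsadmin -refreshNodes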

Upvotes: 0

DDW

Reputation: 2015

In hadoop/conf there is a file called 'slaves'; there you can simply add or remove nodes, and then restart your DFS and MapReduce daemons.
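
A minimal sketch, assuming the classic Hadoop 1.x control scripts under $HADOOP_HOME/bin and placeholder hostnames:

    # conf/slaves -- one worker hostname per line; comment a node out to drop it
    slave1.example.com
    slave2.example.com
    slave3.example.com
    slave4.example.com
    # slave5.example.com   excluded for this benchmark run

    # restart the daemons on the master so the new list takes effect
    $HADOOP_HOME/bin/stop-all.sh
    $HADOOP_HOME/bin/start-all.sh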

Upvotes: 2

Alex A.

Reputation: 2736

There is a setting in mapred-site.xml that points to a file with a list of excluded hosts. Though also a bit cumbersome, changing a single configuration value might be preferable to physically decommissioning and recommissioning multiple nodes. You could prepare multiple host-exclusion files in advance, change the setting, and restart the MapReduce service, which is pretty quick.

In 0.23 this setting is named mapreduce.jobtracker.hosts.exclude.filename. It is a feature introduced in 0.21, though I believe the setting was named mapred.hosts.exclude then. Check what the setting is called in the version of Hadoop you are using.
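
A minimal sketch of this approach for a 1.x-era cluster, where the setting is mapred.hosts.exclude; the exclude-file path is a placeholder, one such file prepared per benchmark configuration:

    <!-- mapred-site.xml: tasktrackers listed in the named file are excluded -->
    <property>
      <name>mapred.hosts.exclude</name>  <!-- mapreduce.jobtracker.hosts.exclude.filename in 0.23 -->
      <value>/path/to/exclude-4-slaves</value>  <!-- placeholder: hostnames to leave out of this run -->
    </property>

    # point the setting at the next pre-built exclude list, then restart MapReduce
    $HADOOP_HOME/bin/stop-mapred.sh
    $HADOOP_HOME/bin/start-mapred.sh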

Upvotes: 1
