hadoop conf/masters and conf/slaves on jobtracker?

Question

In a Hadoop cluster (1.x version) where the NameNode and JobTracker are not the same server, do conf/masters and conf/slaves need to be specified on both the NameNode and the JobTracker, or just on the NameNode? I couldn't find a direct answer to this in the docs.

Chris White · Accepted Answer

The slaves and masters files in the conf folder are used only by the start-mapred.sh, start-dfs.sh and start-all.sh scripts in the bin folder (and their stop- counterparts). These are convenience scripts: you run them on a single node, and they ssh into each master / slave node to start the desired Hadoop service daemons. Each script is meant to be launched from the appropriate 'master' node:

start-dfs.sh - started from the node you want to be the Name Node
start-mapred.sh - started from the node you want to be the Job Tracker
start-all.sh - delegates to the two scripts above, and should be run from the node you want to be both the Name Node and Job Tracker
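
To make the "ssh into each node" behaviour concrete, here is a simplified dry-run sketch of the kind of loop these scripts perform over a hosts file. It is not the actual Hadoop source: the function name list_daemon_commands is made up for illustration, and the real scripts also pass configuration and ssh options. It only prints the commands rather than running them.

```shell
#!/usr/bin/env bash
# Dry-run sketch (hypothetical helper, not part of Hadoop):
# print the ssh command that would be issued for each host in a hosts file.
# The file holds one hostname per line; '#' comments and blank lines are skipped.
list_daemon_commands() {
  local hosts_file="$1" daemon="$2"
  # Strip comments and drop empty lines, then emit one command per host.
  sed -e 's/#.*$//' -e '/^[[:space:]]*$/d' "$hosts_file" |
    while read -r host || [ -n "$host" ]; do
      echo "ssh $host hadoop-daemon.sh start $daemon"
    done
}
```

For example, list_daemon_commands conf/slaves datanode would print one line per host in conf/slaves. Since the loop only needs the hosts file on the machine where it runs, the file matters only on the node you launch from.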

The slaves file lists the hostnames of all the compute nodes (that is, the nodes you want to run both a Data Node and a Task Tracker service on), while the masters file contains the hostname of the node that should run the Secondary Name Node.
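
As an illustration of what the two files might contain, the sketch below writes a sample pair into a scratch directory. All hostnames are placeholders, and the conf-example directory and CONF_DIR variable are assumptions made for this example, not Hadoop defaults.

```shell
#!/usr/bin/env bash
# Hypothetical layout: three worker nodes and one Secondary Name Node host.
conf_dir="${CONF_DIR:-./conf-example}"   # assumed scratch location, not a Hadoop default
mkdir -p "$conf_dir"

# slaves: one hostname per line; each host runs a Data Node and a Task Tracker.
cat > "$conf_dir/slaves" <<'EOF'
worker1.example.com
worker2.example.com
worker3.example.com
EOF

# masters: despite the name, this lists the Secondary Name Node host.
cat > "$conf_dir/masters" <<'EOF'
snn.example.com
EOF
```

Note that neither file names the Name Node or the Job Tracker: those daemons start on whichever node the corresponding script is run from.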

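With this in mind, you only need the slaves and masters files to be present on the Name Node, and that's only if you plan to launch your cluster from this single node (using password-less ssh). If the Job Tracker is a separate node and you run start-mapred.sh there, that node needs the slaves file as well, since the script reads it to find the Task Tracker hosts; it does not need the masters file.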
