Reputation: 2097
Hadoop 2.7 is installed at /opt/pro/hadoop/hadoop-2.7.3 on the master. I then copied the whole installation to the slave, but into a different directory, /opt/pro/hadoop-2.7.3. I updated the environment variables and configuration on the slave (e.g., HADOOP_HOME, and hdfs-site.xml for the namenode and datanode).
Now hadoop version runs successfully on the slave. However, on the master, start-dfs.sh fails with this message:
17/02/18 10:24:32 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [master]
master: starting namenode, logging to /opt/pro/hadoop/hadoop-2.7.3/logs/hadoop-shijiex-namenode-shijie-ThinkPad-T410.out
master: starting datanode, logging to /opt/pro/hadoop/hadoop-2.7.3/logs/hadoop-shijiex-datanode-shijie-ThinkPad-T410.out
slave: bash: line 0: cd: /opt/pro/hadoop/hadoop-2.7.3: No such file or directory
slave: bash: /opt/pro/hadoop/hadoop-2.7.3/sbin/hadoop-daemon.sh: No such file or directory
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /opt/pro/hadoop/hadoop-2.7.3/logs/hadoop-shijiex-secondarynamenode-shijie-ThinkPad-T410.out
17/02/18 10:26:15 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Hadoop uses the master's HADOOP_HOME (/opt/pro/hadoop/hadoop-2.7.3) on the slave, while HADOOP_HOME on the slave is /opt/pro/hadoop-2.7.3.
So does HADOOP_HOME have to be the same on all nodes at installation time?
.bashrc
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export PATH=$PATH:/usr/lib/jvm/java-7-openjdk-amd64/bin
export HADOOP_HOME=/opt/pro/hadoop-2.7.3
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
# Add Hadoop bin/ directory to PATH
export PATH=$PATH:$HADOOP_HOME/bin
hadoop-env.sh
# The java implementation to use.
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
On the slave, $HADOOP_HOME/etc/hadoop contains a masters file:
xx@wodaxia:/opt/pro/hadoop-2.7.3/etc/hadoop$ cat masters
master
Upvotes: 2
Views: 1018
Reputation: 18300
No, not necessarily. But if the paths differ among the nodes, you cannot use scripts such as start-dfs.sh and stop-dfs.sh, or the corresponding YARN scripts. These scripts use the $HADOOP_PREFIX variable of the node where the script is executed.
Here is the snippet from hadoop-daemons.sh, which start-dfs.sh uses to start all the datanodes:
exec "$bin/slaves.sh" --config $HADOOP_CONF_DIR cd "$HADOOP_PREFIX" \; "$bin/hadoop-daemon.sh" --config $HADOOP_CONF_DIR "$@"
The script is written this way on the assumption that all nodes of the cluster use the same $HADOOP_PREFIX (or the deprecated $HADOOP_HOME) path.
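To make that assumption concrete, here is a minimal sketch of how the remote command ends up carrying the master's path. It is not the actual Hadoop code: build_remote_cmd is a hypothetical helper, and the host name slave and the paths are taken from the question. The point is only that the prefix is expanded on the node that runs the script, before anything reaches the remote host.

```shell
#!/usr/bin/env bash
# Sketch only: mimics how the remote command line is composed on the master.
# build_remote_cmd is a hypothetical helper, not part of Hadoop.
build_remote_cmd() {
  local prefix="$1" daemon="$2"
  # "$prefix" is expanded here, on the node running the script,
  # before the command is handed to ssh.
  echo "cd $prefix ; $prefix/sbin/hadoop-daemon.sh start $daemon"
}

HADOOP_PREFIX=/opt/pro/hadoop/hadoop-2.7.3   # the master's install path
# Print (do not run) what would be sent to the slave.
echo "ssh slave $(build_remote_cmd "$HADOOP_PREFIX" datanode)"
```

The printed command refers to /opt/pro/hadoop/hadoop-2.7.3 on the slave, which matches the two "No such file or directory" lines in the log from the question.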
To work around this, either:
1) Use the same path on all the nodes, or
2) Log in to each node in the cluster and start the DFS daemons that apply to that node:
$HADOOP_HOME/sbin/hadoop-daemon.sh start <namenode | datanode | secondarynamenode| journalnode>
The same procedure applies to YARN:
$HADOOP_HOME/sbin/yarn-daemon.sh start <resourcemanager | nodemanager>
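If logging in to every node by hand becomes tedious, option 2 could be scripted roughly as follows. This is a sketch under assumptions, not a drop-in replacement for start-dfs.sh: the functions hadoop_home_for, daemon_cmd and start_on and the DRY_RUN variable are hypothetical, the node names and paths are the ones from the question, and passwordless ssh between the nodes is assumed. It also assumes JAVA_HOME is set in hadoop-env.sh on each node, since a non-interactive ssh session may not read .bashrc.

```shell
#!/usr/bin/env bash
# Sketch only: start a daemon on a node using that node's own install path.
# All function names and DRY_RUN are hypothetical, not part of Hadoop.

# Per-node install paths (values taken from the question).
hadoop_home_for() {
  case "$1" in
    master) echo "/opt/pro/hadoop/hadoop-2.7.3" ;;
    slave)  echo "/opt/pro/hadoop-2.7.3" ;;
    *)      return 1 ;;
  esac
}

# Build the hadoop-daemon.sh command line for a given node and daemon.
daemon_cmd() {
  local home
  home="$(hadoop_home_for "$1")" || return 1
  echo "$home/sbin/hadoop-daemon.sh --config $home/etc/hadoop start $2"
}

# Print the ssh command by default; set DRY_RUN=0 to actually run it.
start_on() {
  local cmd
  cmd="$(daemon_cmd "$1" "$2")" || { echo "unknown node: $1" >&2; return 1; }
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "ssh $1 $cmd"
  else
    ssh "$1" "$cmd"
  fi
}

start_on master namenode
start_on slave datanode
```

Keeping the path lookup separate from the ssh call means the generated commands can be inspected in dry-run mode before anything is started.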
Upvotes: 5
Reputation: 8937
No, it does not have to be. $HADOOP_HOME is set individually on each Hadoop node, and it can be set in different ways: globally, in the .bashrc file, or locally, for example in the hadoop-env.sh script in your Hadoop folder. Verify that the values are the same on every node of the cluster. If it is set globally, you can check it with echo $HADOOP_HOME. If it is set in the script, you can check the variable by sourcing the script into the current shell and echoing it again:
. /opt/pro/hadoop/hadoop-2.7.3/etc/hadoop/hadoop-env.sh
echo $HADOOP_HOME
Also make sure that the hadoop.home.dir property is not set in your configuration, as it overrides the $HADOOP_HOME environment variable.
Upvotes: 0