Reputation: 936
I have a 12 node cluster. Its hardware details are:
NameNode: CPU Core i3 2.7 GHz | 8 GB RAM | 500 GB HDD
DataNode: CPU Core i3 2.7 GHz | 2 GB RAM | 500 GB HDD
I have installed Hadoop 2.7.2. I used the normal Hadoop installation process on Ubuntu and it works fine. But I want to add a client machine, and I have no clue how to do that.
Question: How do I add a client machine to this Hadoop cluster?
Upvotes: 3
Views: 11383
Reputation: 3849
The client should have the same copy of the Hadoop distribution and configuration that is present on the NameNode. Only then will the client know on which node the JobTracker/ResourceManager is running, and the IP of the NameNode, in order to access HDFS data.
You also need to update /etc/hosts on the client machine with the IP addresses and hostnames of the NameNode and DataNodes.
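For illustration, the /etc/hosts entries might look like this, one line per cluster node (the hostnames and addresses here are placeholders, not values from the question):

    192.168.1.10   namenode
    192.168.1.11   datanode1
    192.168.1.12   datanode2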
Note that you shouldn't start any Hadoop services on the client machine.
Steps to follow on the client machine:
1. Create a user account on the cluster, say user1.
2. Create an account on the client machine with the same name: user1.
3. Copy the same Hadoop distribution and configuration that is present on the NameNode to the client machine and extract it to /home/user1/hadoop-2.x.x.
4. Set the environment variables JAVA_HOME and HADOOP_HOME (/home/user1/hadoop-2.x.x), and put the Hadoop binaries on your path: export PATH=$HADOOP_HOME/bin:$PATH
5. Test it out: hadoop fs -ls / should list the root directory of the cluster HDFS.
A consolidated sketch of steps 4 and 5 follows below.
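As a rough sketch, the exports from step 4 can go into the client user's ~/.bashrc. The JDK path below is an assumption (a typical Ubuntu OpenJDK location), and hadoop-2.7.2 matches the version the question says the cluster runs; adjust both to your layout:

    # ~/.bashrc on the client machine, logged in as user1
    export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64   # assumption: adjust to your JDK
    export HADOOP_HOME=/home/user1/hadoop-2.7.2          # 2.7.2 to match the cluster
    export PATH=$HADOOP_HOME/bin:$PATH

    # reload the environment and test against the cluster
    source ~/.bashrc
    hadoop fs -ls /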
You may face some issues, like missing privileges; you may also need to set JAVA_HOME in places like hadoop-env.sh (under $HADOOP_HOME/etc/hadoop/ in Hadoop 2.x) on the client machine. Update/comment with any error you get.
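If Hadoop commands still complain that JAVA_HOME is not set after the exports above, hard-coding it in hadoop-env.sh usually helps; the JDK path is again an assumption:

    # $HADOOP_HOME/etc/hadoop/hadoop-env.sh on the client machine
    export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64   # assumption: adjust to your JDK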
Answers to more questions from comments:
1. How do I upload data to HDFS from the client machine? Use hadoop fs commands from the client machine: hadoop fs -put /home/user1/data/* /user/user1/data. You can also write shell scripts that run these command(s) if you need to run them many times; see the sketch below.
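A minimal sketch of such a script, using the same local and HDFS paths as the command above:

    #!/bin/bash
    # upload_data.sh - push local files into the cluster HDFS from the client node
    LOCAL_DIR=/home/user1/data      # files on the client machine
    HDFS_DIR=/user/user1/data       # destination directory in HDFS

    hadoop fs -mkdir -p "$HDFS_DIR"              # create the target directory if missing
    hadoop fs -put "$LOCAL_DIR"/* "$HDFS_DIR"    # upload everything in LOCAL_DIR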
2. Why am I installing Hadoop on the client if we only use ssh to connect remotely to the master node? You are not using ssh to connect here; you are performing operations on the Hadoop cluster from the client node, so you need the Hadoop binaries on it. Those binaries handle the communication with the cluster when you run operations like hadoop fs -ls / from the client node (remember adding $HADOOP_HOME/bin to PATH as part of the installation process above). "Only use ssh" sounds like the administrative case: when you want to change or access Hadoop configuration files on the cluster, you connect to the cluster nodes over ssh. But when you run Hadoop commands/jobs against the cluster from the client node, you don't need to ssh manually; the Hadoop installation on the client node takes care of it.
3. Must the user name user1 be the same on the client? What if it is different? It will work. You can install Hadoop on the client node under a group user, say qa or dev, and make all users on the client node sudoers under that group. Then, when user1 on the client node needs to run a Hadoop job on the cluster, user1 can sudo -i -u qa and run the Hadoop command from there.
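As a quick illustration, with qa as the group user from the answer:

    # run a single hadoop command as qa from user1's shell on the client node
    sudo -i -u qa hadoop fs -ls /user/qa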
Upvotes: 9