Sandeep

Reputation: 349

Adding Data Node to hadoop cluster

When I start hadoopnode1 using start-all.sh, it successfully starts the services on both master and slave (see the jps command output for the slave). But when I look at the live nodes in the admin screen, the slave node doesn't show up. Even the hadoop fs -ls / command runs perfectly from the master, but from the slave it shows an error message:

@hadoopnode2:~/hadoop-0.20.2/conf$ hadoop fs -ls /
12/05/28 01:14:20 INFO ipc.Client: Retrying connect to server: hadoopnode1/192.168.1.120:8020. Already tried 0 time(s).
12/05/28 01:14:21 INFO ipc.Client: Retrying connect to server: hadoopnode1/192.168.1.120:8020. Already tried 1 time(s).
12/05/28 01:14:22 INFO ipc.Client: Retrying connect to server: hadoopnode1/192.168.1.120:8020. Already tried 2 time(s).
12/05/28 01:14:23 INFO ipc.Client: Retrying connect to server: hadoopnode1/192.168.1.120:8020. Already tried 3 time(s).
.
.
.
12/05/28 01:14:29 INFO ipc.Client: Retrying connect to server: hadoopnode1/192.168.1.120:8020. Already tried 10 time(s).

It looks like the slave (hadoopnode2) is unable to find/connect to the master node (hadoopnode1).

Please point out what I am missing.

Here are the settings from the master and slave nodes. P.S. - Master and slave are running the same versions of Linux and Hadoop, and SSH is working perfectly, because I can start the slave from the master node.

Also, the settings in core-site.xml, hdfs-site.xml and mapred-site.xml are the same on the master (hadoopnode1) and the slave (hadoopnode2).

OS - Ubuntu 10
Hadoop version -

hadoop@hadoopnode1:~/hadoop-0.20.2/conf$ hadoop version
Hadoop 0.20.2
Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707
Compiled by chrisdo on Fri Feb 19 08:07:34 UTC 2010

-- Master (hadoopnode1)

hadoop@hadoopnode1:~/hadoop-0.20.2/conf$ uname -a
Linux hadoopnode1 2.6.35-32-generic #67-Ubuntu SMP Mon Mar 5 19:35:26 UTC 2012 i686 GNU/Linux

hadoop@hadoopnode1:~/hadoop-0.20.2/conf$ jps
9923 Jps
7555 NameNode
8133 TaskTracker
7897 SecondaryNameNode
7728 DataNode
7971 JobTracker

masters -> hadoopnode1
slaves -> hadoopnode1
hadoopnode2

--Slave (hadoopnode2)

hadoop@hadoopnode2:~/hadoop-0.20.2/conf$ uname -a
Linux hadoopnode2 2.6.35-32-generic #67-Ubuntu SMP Mon Mar 5 19:35:26 UTC 2012 i686 GNU/Linux

hadoop@hadoopnode2:~/hadoop-0.20.2/conf$ jps
1959 DataNode
2631 Jps
2108 TaskTracker

masters - hadoopnode1

core-site.xml
hadoop@hadoopnode2:~/hadoop-0.20.2/conf$ cat core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
        <property>
                <name>hadoop.tmp.dir</name>
                <value>/var/tmp/hadoop/hadoop-${user.name}</value>
                <description>A base for other temp directories</description>
        </property>

        <property>
                <name>fs.default.name</name>
                <value>hdfs://hadoopnode1:8020</value>
                <description>The name of the default file system</description>
        </property>

</configuration>

hadoop@hadoopnode2:~/hadoop-0.20.2/conf$ cat mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
        <property>
                <name>mapred.job.tracker</name>
                <value>hadoopnode1:8021</value>
                <description>The host and port that the MapReduce job tracker runs at. If "local", then jobs are run in-process as a single map and reduce task.</description>
        </property>
</configuration>

hadoop@hadoopnode2:~/hadoop-0.20.2/conf$ cat hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
        <property>
                <name>dfs.replication</name>
                <value>2</value>
                <description>Default block replication</description>
        </property>
</configuration>

Upvotes: 5

Views: 20385

Answers (6)

vpap

Reputation: 1547

In the web GUI you can see the number of nodes in your cluster. If you see fewer than you expected, then make sure the /etc/hosts file on the master contains only these hosts (for a 2-node cluster):

192.168.0.1 master
192.168.0.2 slave

If you see any 127.0.1.1 entries, comment them out, because Hadoop will resolve your hostnames to them first.
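A quick way to spot the problematic entries is a sketch like the one below. The function name is hypothetical and it just greps whatever file you pass it, so you can try it on a copy of /etc/hosts first:

```shell
#!/bin/sh
# Sketch: list the lines in a hosts file that map a hostname to
# 127.0.1.1, which Hadoop may pick up instead of the real LAN address.
find_loopback_aliases() {
  hosts_file="$1"                      # usually /etc/hosts
  grep '^127\.0\.1\.1' "$hosts_file"
}

# Example: find_loopback_aliases /etc/hosts
# Any line it prints should be commented out on a 2-node cluster.
```

Any line it reports should be commented out and replaced with the node's real LAN address, as in the two-line example above.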

Upvotes: 1

prem

Reputation: 604

Check your services with sudo jps. If the DataNode is not displayed on the slave, here is what you need to do:

1. Restart Hadoop
2. Go to /app/hadoop/tmp/dfs/name/current
3. Open VERSION (i.e. by vim VERSION)
4. Record namespaceID
5. Go to /app/hadoop/tmp/dfs/data/current
6. Open VERSION (i.e. by vim VERSION)
7. Replace the namespaceID with the namespaceID you recorded in step 4.

This should work. Best of luck!
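The manual edit in steps 2-7 can be sketched as a small script. The paths in the comments match the /app/hadoop/tmp layout assumed above (adjust them to your own hadoop.tmp.dir), and the function name is made up for illustration; run it only while Hadoop is stopped:

```shell
#!/bin/sh
# Sketch: copy the namenode's namespaceID into the datanode's
# VERSION file so the two storage directories match again.
sync_namespace_id() {
  name_version="$1"   # e.g. /app/hadoop/tmp/dfs/name/current/VERSION
  data_version="$2"   # e.g. /app/hadoop/tmp/dfs/data/current/VERSION
  # Extract the namespaceID recorded by the namenode...
  ns_id=$(grep '^namespaceID=' "$name_version" | cut -d= -f2)
  # ...and write it over the stale ID in the datanode's file.
  sed -i "s/^namespaceID=.*/namespaceID=$ns_id/" "$data_version"
}
```

After syncing, restart Hadoop and check the live-nodes page again.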

Upvotes: 1

William Yao

Reputation: 1

Indeed, there are two errors in your case.

1. Can't connect to the Hadoop master node from the slave

That's a network problem. Test it: curl 192.168.1.120:8020

Normal Response: curl: (52) Empty reply from server

In my case, I got a "host not found" error, so take a look at your firewall settings.

2. Data node down

That's a Hadoop problem. Raze2dust's method is good. Here's another way, if you see an Incompatible namespaceIDs error in your log:

Stop Hadoop and edit the value of namespaceID in /current/VERSION to match the value on the current namenode, then start Hadoop.

You can always check the available datanodes using: hadoop fsck /

Upvotes: 0

Black_Rider

Reputation: 1575

Add the new node's hostname to the slaves file, then start the DataNode and TaskTracker on the new node.

Upvotes: 0

Sandeep

Reputation: 546

I think the problem is in slave 2's configuration: it should connect to the master on port 8020 instead of 8021.

Upvotes: 0

Hari Menon

Reputation: 35405

Check the namenode and datanode logs (they should be in $HADOOP_HOME/logs/). The most likely issue is that the namenode and datanode namespaceIDs don't match. Delete the hadoop.tmp.dir directory on all nodes, format the namenode again ($HADOOP_HOME/bin/hadoop namenode -format), and then try again.
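A sketch of that cleanup is below. The tmp path comes from the hadoop.tmp.dir value in the core-site.xml shown in the question, the helper name is made up, and the whole thing destroys all HDFS data, so verify the paths before running anything:

```shell
#!/bin/sh
# Sketch: remove the Hadoop storage directories on a node, then
# (on the master only) re-format the namenode.
# WARNING: this destroys all data stored in HDFS.
wipe_tmp_dirs() {
  for d in "$@"; do
    rm -rf "$d"        # stale namespaceID lives under dfs/*/current/VERSION
  done
}

# On every node (path from hadoop.tmp.dir in core-site.xml):
#   wipe_tmp_dirs /var/tmp/hadoop/hadoop-$USER
# Then, on the master only:
#   $HADOOP_HOME/bin/hadoop namenode -format
```

After the format, start the cluster again with start-all.sh and the slave should register as a live node.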

Upvotes: 0
