Reputation: 33
We have been trying to install a Hadoop cluster over the past few days. It sometimes succeeds, but most of the time it fails. I made the configuration according to the official documentation and some seemingly high-quality blogs.
The problem I am running into is:
All the processes (including namenode, datanode, nodemanager, resourcemanager) can be seen with the command jps.
But the slaves are actually not working: I cannot see them in the web interfaces at master:8088 or master:50070.
Someone said this is caused by repeated namenode formatting and the resulting ID conflict. I think that does not apply to my problem, since the datanodes have not worked from the very beginning and the datanode folders are always empty.
Are there any other possible causes of this behaviour? I am really struggling to set up the cluster.
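One way to see why a slave's DataNode does not register is to look at its log on that slave. A minimal check, assuming the default log directory under $HADOOP_HOME/logs (the exact file name depends on the user and hostname):

# On a slave node: look for IPC errors such as "Retrying connect to server"
# (the log path below is an assumption; it depends on HADOOP_LOG_DIR)
tail -n 50 $HADOOP_HOME/logs/hadoop-*-datanode-*.log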
Details:
Hadoop version: 3.0.0-alpha2
Output of hdfs dfsadmin -report:
Configured Capacity: 492017770496 (458.23 GB)
Present Capacity: 461047037952 (429.38 GB)
DFS Remaining: 460770820096 (429.13 GB)
DFS Used: 276217856 (263.42 MB)
DFS Used%: 0.06%
Under replicated blocks: 10069
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
Pending deletion blocks: 0
-------------------------------------------------
Live datanodes (1):
Name: 127.0.0.1:9866 (localhost)
Hostname: sr145.local.lan
Decommission Status : Normal
Configured Capacity: 492017770496 (458.23 GB)
DFS Used: 276217856 (263.42 MB)
Non DFS Used: 5954019328 (5.55 GB)
DFS Remaining: 460770820096 (429.13 GB)
DFS Used%: 0.06%
DFS Remaining%: 93.65%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sun Jan 02 02:52:57 CST 2000
**The only live datanode is on the same node as the master node.** All the other slave nodes are not live.
Configuration details:
1. hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>sr145:9001</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/opt/hadoop/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/opt/hadoop/hdfs/datanode</value>
  </property>
</configuration>
2. core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/opt/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
</configuration>
3. yarn-site.xml:
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.manager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>sr145</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>sr145:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>sr145:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>sr145:8035</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>sr145:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>sr145:8088</value>
  </property>
</configuration>
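Since the NodeManagers do not show up at master:8088 either, the nodes that actually registered with the ResourceManager can also be listed on the command line. A minimal check, run on the master:

# Lists the NodeManagers the ResourceManager currently knows about
yarn node -list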
The configuration on all nodes is almost the same; it only differs between the namenode and datanode settings in hdfs-site.xml. The workers and slaves files in $HADOOP_HOME/etc/hadoop have also been edited. No files have been moved compared to the stock layout.
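For reference, a minimal sketch of what the workers (and, in older releases, slaves) file is expected to contain: one hostname per line. sr146 and sr147 below are hypothetical slave names; sr145 is the master, which here also runs a DataNode:

sr145
sr146
sr147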
Upvotes: 3
Views: 4035
Reputation: 18270
> The only live datanode is on the same node as the master node.

Only that datanode is aware that the namenode is bound to localhost; all the other datanodes are trying to connect to sr145.
The host value set in fs.defaultFS is where the NameNode daemon will be started. Setting it to localhost on the master node made the process bind to the node's loopback IP. Edit the value to use the actual hostname or IP address; in this case it will be
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://sr145:9000</value>
</property>
This property must be identical on all the nodes.
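After changing the value and restarting HDFS, the effective setting and the NameNode's listen address can be verified. A minimal sketch (ss may be netstat on some systems):

# Print the effective fs.defaultFS as the daemons see it
hdfs getconf -confKey fs.defaultFS
# The NameNode RPC port should now be bound to the real interface, not 127.0.0.1
ss -tln | grep 9000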
Upvotes: 2