Reputation: 479
I tried to set up Hadoop in fully distributed mode, but an error occurred. My setup procedure is described below:
① Server configuration
The master server's hostname is master, and the slave servers' hostnames are node1 and node2. All servers run CentOS 7. The master's IP address is 131.113.101.103, and the slaves' IP addresses are 131.113.101.101 and 131.113.101.102.
② Settings on each server
I edited /etc/hosts and /etc/hostname; only the master server's files are shown here.
○/etc/hostname
master
○/etc/hosts
131.113.101.101 node1
131.113.101.102 node2
131.113.101.103 master
Installed packages:
sudo yum -y install epel-release
sudo yum -y install openssh-clients rsync wget java-1.8.0-openjdk-devel sshpass
Downloaded Hadoop:
wget http://ftp.riken.jp/net/apache/hadoop/common/hadoop-2.8.1/hadoop-2.8.1.tar.gz
tar xf hadoop-2.8.1.tar.gz
Edited .bashrc:
export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk
export HADOOP_HOME=~/hadoop-2.8.1
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$JAVA_HOME/bin:$PATH
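After editing, the file needs to be reloaded in the current shell, for example:
source ~/.bashrc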
Then I ran hadoop version to check, and it worked.
③ Settings on the master server
Configured SSH without a passphrase:
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
I sent the public key to node1 and node2 and renamed it authorized_keys there. I then checked that I could SSH from master to node1 and node2 without a passphrase.
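For reference, one way to distribute the key is with ssh-copy-id (a sketch; username is a placeholder, and password authentication is assumed to still be enabled on the nodes):
# copy the public key into each node's ~/.ssh/authorized_keys
ssh-copy-id -i ~/.ssh/id_rsa.pub username@node1
ssh-copy-id -i ~/.ssh/id_rsa.pub username@node2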
○/etc/hadoop/slaves
node1
node2
○/etc/hadoop/core-site.xml
<property>
<name>fs.defaultFS</name>
<value>hdfs://131.113.101.103:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/tmp/hadoop-username/</value>
</property>
○/etc/hadoop/hdfs-site.xml
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>131.113.101.103:50090</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file://${hadoop.tmp.dir}/dfs/name</value>
</property>
<property>
<name>dfs.datanode.name.dir</name>
<value>file://${hadoop.tmp.dir}/dfs/data</value>
</property>
○/etc/hadoop/mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
○/etc/hadoop/yarn-site.xml
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
</property>
I sent these config files to node1 and node2.
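For example, using the rsync installed earlier (a sketch; the paths assume the layout described above):
# push the whole config directory to each slave
rsync -av ~/hadoop-2.8.1/etc/hadoop/ node1:~/hadoop-2.8.1/etc/hadoop/
rsync -av ~/hadoop-2.8.1/etc/hadoop/ node2:~/hadoop-2.8.1/etc/hadoop/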
④ Start Hadoop
Formatted HDFS:
$HADOOP_HOME/bin/hdfs namenode -format
Started the daemons:
$HADOOP_HOME/sbin/start-dfs.sh
$HADOOP_HOME/sbin/start-yarn.sh
$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh --config $HADOOP_CONF_DIR start historyserver
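As a quick sanity check (assuming curl is available), the NameNode web UI should respond on its default port 50070:
curl -sI http://master:50070/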
Then I used the jps command to check each server's processes.
On the master server:
NameNode
Jps
ResourceManager
SecondaryNameNode
JobHistoryServer
On the node servers:
DataNode
Jps
NodeManager
Then I ran this command:
$HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.1.jar pi 10 10000
But the following error was returned:
Number of Maps = 10
Samples per Map = 10000
17/10/25 03:00:16 WARN hdfs.DataStreamer: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/username/QuasiMonteCarlo_1508868015200_1006439027/in/part0 could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1733)
at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:265)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2496)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:828)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:506)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:447)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:845)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:788)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2455)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1481)
at org.apache.hadoop.ipc.Client.call(Client.java:1427)
at org.apache.hadoop.ipc.Client.call(Client.java:1337)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:440)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:398)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:335)
at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DataStreamer.locateFollowingBlock(DataStreamer.java:1733)
at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1536)
at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:658)
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/username/QuasiMonteCarlo_1508868015200_1006439027/in/part0 could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1733)
at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:265)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2496)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:828)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:506)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:447)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:845)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:788)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2455)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1481)
at org.apache.hadoop.ipc.Client.call(Client.java:1427)
at org.apache.hadoop.ipc.Client.call(Client.java:1337)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:440)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:398)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:335)
at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DataStreamer.locateFollowingBlock(DataStreamer.java:1733)
at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1536)
at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:658)
I searched for a solution but have not found one that works.
----added----
The result of
bin/hadoop dfsadmin -report
is:
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
Configured Capacity: 0 (0 B)
Present Capacity: 0 (0 B)
DFS Remaining: 0 (0 B)
DFS Used: 0 (0 B)
DFS Used%: NaN%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
Pending deletion blocks: 0
-------------------------------------------------
It seems there are no active datanodes…
But on node1 and node2, the DataNode process does appear in the jps output.
I also checked /home/username/hadoop-2.8.1/logs/hadoop-username-datanode-node1.out
and /home/username/hadoop-2.8.1/logs/hadoop-username-datanode-node2.out.
The results are below:
○node1
ulimit -a for user username
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 256944
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 4096
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
○node2
ulimit -a for user username
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 256944
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 4096
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
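As far as I understand, these .out files only capture the ulimit header printed at startup; the detailed startup messages should be in the corresponding .log files, for example:
tail -n 50 /home/username/hadoop-2.8.1/logs/hadoop-username-datanode-node1.log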
I also checked sudo netstat -ntlp on the master server; the netstat and jps results are below:
○jps result
17252 JobHistoryServer
16950 ResourceManager
17418 Jps
16508 NameNode
16701 SecondaryNameNode
○sudo netstat -ntlp result
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 131.113.101.103:50090 0.0.0.0:* LISTEN 16701/java
tcp 0 0 0.0.0.0:19888 0.0.0.0:* LISTEN 17252/java
tcp 0 0 0.0.0.0:10033 0.0.0.0:* LISTEN 17252/java
tcp 0 0 0.0.0.0:50070 0.0.0.0:* LISTEN 16508/java
tcp 0 0 0.0.0.0:10020 0.0.0.0:* LISTEN 17252/java
tcp 0 0 131.113.101.103:9000 0.0.0.0:* LISTEN 16508/java
tcp6 0 0 131.113.101.103:8088 :::* LISTEN 16950/java
tcp6 0 0 131.113.101.103:8030 :::* LISTEN 16950/java
tcp6 0 0 131.113.101.103:8031 :::* LISTEN 16950/java
tcp6 0 0 131.113.101.103:8032 :::* LISTEN 16950/java
tcp6 0 0 131.113.101.103:8033 :::* LISTEN 16950/java
On node2, the results are below:
○jps result
12228 NodeManager
12045 DataNode
12493 Jps
○sudo netstat -ntlp result
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 127.0.0.1:33742 0.0.0.0:* LISTEN 12045/java
tcp 0 0 0.0.0.0:50010 0.0.0.0:* LISTEN 12045/java
tcp 0 0 0.0.0.0:50075 0.0.0.0:* LISTEN 12045/java
tcp 0 0 0.0.0.0:50020 0.0.0.0:* LISTEN 12045/java
tcp6 0 0 :::8042 :::* LISTEN 12228/java
tcp6 0 0 :::13562 :::* LISTEN 12228/java
tcp6 0 0 :::8040 :::* LISTEN 12228/java
tcp6 0 0 :::42633 :::* LISTEN 12228/java
Are there any wrong points?
I find it strange that on node2 nothing is listening on the local address 131.113.101.102.
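In case it helps, here is one way to double-check, on node2 itself, how the node resolves its own hostname (just a guess that name resolution is related to the 127.0.0.1 binding):
getent hosts node2
hostname -f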
Upvotes: 1
Views: 392
Reputation: 9102
Your error stack trace shows that the datanodes are not running. Check the datanode startup logs for more info. Apart from this, you can check whether your problem is similar to what is described here or here. Also try running the command below from the name node. Although I am running Hadoop in standalone mode, it should show similar info for you, indicating the number of live datanodes.
bin/hadoop dfsadmin -report
It should give you info about the live nodes:
Configured Capacity: 240611487744 (224.09 GB)
Present Capacity: 79048312831 (73.62 GB)
DFS Remaining: 79040917504 (73.61 GB)
DFS Used: 7395327 (7.05 MB)
DFS Used%: 0.01%
Under replicated blocks: 36
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
-------------------------------------------------
Live datanodes (1):
Name: 127.0.0.1:50010 (127.0.0.1)
Hostname: HSNMM-Shailendra.com
Decommission Status : Normal
Configured Capacity: 240611487744 (224.09 GB)
DFS Used: 7395327 (7.05 MB)
Non DFS Used: 161563174913 (150.47 GB)
DFS Remaining: 79040917504 (73.61 GB)
DFS Used%: 0.00%
DFS Remaining%: 32.85%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Tue Oct 24 23:39:47 IST 2017
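Note that, as the deprecation warning in your own output says, the same report can be run through the hdfs script instead:
bin/hdfs dfsadmin -report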
Upvotes: 1