Reputation: 1200
I'm trying to run the wordcount example on a cluster set up on AWS. It hangs and just says running job.
I found this error in the resourcemanager log
I can view all of my nodes via the HDFS UI (namenode:50070).
However, when I try to view more info about the cluster via namenode:8088/cluster/nodes, it says there are 0 nodes.
Any ideas? I've tried editing the yarn-site.xml to specify min/max memory and cores but it didn't work.
Edit: Here are the errors from the NodeManager log file:
2018-02-08 19:28:41,110 INFO org.apache.hadoop.http.HttpServer2: Jetty bound to port 8042
2018-02-08 19:28:41,111 INFO org.mortbay.log: jetty-6.1.26
2018-02-08 19:28:41,246 INFO org.mortbay.log: Extract jar:file:/usr/local/hadoop/share/hadoop/yarn/hadoop-yarn-common-2.9.0.jar!/webapps/node to /tmp/Jetty_0_0_0_0_8042_node____19tj0x/webapp
2018-02-08 19:28:42,777 INFO org.mortbay.log: Started HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:8042
2018-02-08 19:28:42,777 INFO org.apache.hadoop.yarn.webapp.WebApps: Web app node started at 8042
2018-02-08 19:28:42,783 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Node ID assigned is : ec2-34-227-117-73.compute-1.amazonaws.com:39885
2018-02-08 19:28:42,797 INFO org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8031
2018-02-08 19:28:42,798 INFO org.apache.hadoop.util.JvmPauseMonitor: Starting JVM pause monitor
2018-02-08 19:28:42,861 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Sending out 0 NM container statuses: []
2018-02-08 19:28:42,866 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Registering with RM using containers :[]
2018-02-08 19:28:43,935 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-02-08 19:28:44,936 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-02-08 19:28:45,937 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-02-08 19:28:46,937 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-02-08 19:28:47,938 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-02-08 19:28:48,939 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
Upvotes: 1
Views: 2163
Reputation: 5967
You're making a common mistake in your understanding of Hadoop. Hadoop consists of a filesystem (HDFS) and a compute engine (YARN). Datanodes only show HDFS capability. To run jobs you need the Resource Manager and you also need Node Managers to provide compute capability.
Your screenshot of the Resource Manager bears this out. No Node Managers have registered with it, so you have no vcores or memory to compute with.
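One thing worth checking, going by the NodeManager log in the question: the NodeManager is connecting to the ResourceManager at 0.0.0.0:8031, which is what you get when yarn.resourcemanager.hostname is left at its default. If that is the cause, setting it in yarn-site.xml on every node and restarting YARN should let the NodeManagers register. This is only a sketch; the hostname below is a placeholder I made up, so substitute the address of the machine that actually runs your ResourceManager.

```xml
<!-- yarn-site.xml on every node (master and workers) -->
<configuration>
  <property>
    <!-- Placeholder value (assumption): replace with your ResourceManager's
         hostname or private IP. The default is 0.0.0.0, which makes each
         NodeManager try to reach a ResourceManager on its own machine. -->
    <name>yarn.resourcemanager.hostname</name>
    <value>resourcemanager.example.internal</value>
  </property>
</configuration>
```

The resource tracker address (port 8031 in your log) is derived from this hostname by default, so it should not need its own entry. On AWS, also make sure the security group lets the worker nodes reach that port on the ResourceManager host.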
Upvotes: 1