Reputation: 17269
Hadoop client.RMProxy: Connecting to ResourceManager
I set up a single-node cluster on Linux following this guide: http://tecadmin.net/setup-hadoop-2-4-single-node-cluster-on-linux/
When I run a MapReduce application like the one below: hadoop jar hadoop-mapreduce-examples-2.6.0.jar grep input output 'dfs[a-z.]+'
I get the following INFO messages:
15/02/25 23:42:54 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/02/25 23:42:56 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/02/25 23:42:59 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/02/25 23:43:02 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
jps:
5232 SecondaryNameNode
6482 RunJar
5878 NodeManager
6521 Jps
4905 NameNode
5759 ResourceManager
5023 DataNode
How do I connect to the ResourceManager when setting up a single-node cluster?
I tried adding the following to yarn-site.xml, but it didn't work:
<property>
  <name>yarn.resourcemanager.address</name>
  <value>127.0.0.1:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>127.0.0.1:8030</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>127.0.0.1:8031</value>
</property>
Thanks
Upvotes: 4
Views: 8690
Reputation: 431
I had the same issue running a Hadoop instance on Kubernetes. The issue is in the error message itself: the client cannot connect to the ResourceManager.
PS: the ResourceManager listens on port 8032 by default (unless changed).
Make sure you are running the MapReduce job in the same network as the ResourceManager, because the job will try to reach it at:
<RESOURCE_MANAGER_IP>:8032
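A quick way to verify reachability from the node running the job is to probe that port; a minimal sketch (replace the <RESOURCE_MANAGER_IP> placeholder with your actual ResourceManager address):
# From the client node: check whether the RM RPC port is reachable
nc -zv <RESOURCE_MANAGER_IP> 8032
# On the ResourceManager host itself: confirm which address the RM is bound to
netstat -tlnp | grep 8032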
Upvotes: 0
Reputation: 1745
This issue might be due to a missing HADOOP_CONF_DIR, which the MapReduce application needs in order to find the yarn-site.xml where the ResourceManager address is set. So, before running the MapReduce job, try setting/exporting HADOOP_CONF_DIR manually to the appropriate Hadoop conf directory, e.g. export HADOOP_CONF_DIR=/etc/hadoop/conf. This worked for me :)
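For example, the full sequence would look like this (the conf path below is a common default and may differ on your installation):
# Point the client at the directory containing yarn-site.xml, then run the job
export HADOOP_CONF_DIR=/etc/hadoop/conf
hadoop jar hadoop-mapreduce-examples-2.6.0.jar grep input output 'dfs[a-z.]+'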
Upvotes: 0
Reputation: 1
Just remember one aspect of running Hadoop: there are three modes, standalone, pseudo-distributed, and fully distributed.
Standalone and pseudo-distributed modes both run on a single node, in fact on your machine only, and do not need the configuration you have shown: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html
In principle, this is all you need in yarn-site.xml for a single node:
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
But further configuration can also be used. My yarn-site.xml for pseudo-distributed mode looks like this:
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>localhost:8025</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>localhost:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>localhost:8050</value>
  </property>
</configuration>
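After editing yarn-site.xml, restart the YARN daemons so the ResourceManager rebinds to the new addresses. A typical sequence, assuming the standard Hadoop sbin scripts are on your PATH:
stop-yarn.sh
start-yarn.sh
jps    # ResourceManager and NodeManager should both appear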
Tip: be careful with the IPs you type into the config files. I suggest adding the node's IP to your /etc/hosts with a hostname, and then using that hostname in the config files.
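For example, a minimal /etc/hosts entry might look like this (the IP and the hostname hadoop-node below are placeholders; use your node's actual IP and whatever hostname you prefer):
# /etc/hosts
127.0.0.1     localhost
192.168.1.10  hadoop-node    # this node's IP and the hostname referenced in the configs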
Upvotes: 0