JR Galia

Reputation: 17269

Hadoop client.RMProxy: Connecting to ResourceManager

I set up a single-node cluster on Linux following this guide: http://tecadmin.net/setup-hadoop-2-4-single-node-cluster-on-linux/

When I run a MapReduce application like the one below:

hadoop jar hadoop-mapreduce-examples-2.6.0.jar grep input output 'dfs[a-z.]+'

I get the following INFO messages:
15/02/25 23:42:54 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/02/25 23:42:56 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/02/25 23:42:59 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/02/25 23:43:02 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

jps:

5232 SecondaryNameNode
6482 RunJar
5878 NodeManager
6521 Jps
4905 NameNode
5759 ResourceManager
5023 DataNode

How do I connect to the ResourceManager when setting up a single-node cluster?

I tried adding the following to yarn-site.xml, but it didn't work:

<property>
    <name>yarn.resourcemanager.address</name>
    <value>127.0.0.1:8032</value>
</property>
<property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>127.0.0.1:8030</value>
</property>
<property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>127.0.0.1:8031</value>
</property>
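
For reference, one way to check which address the ResourceManager is actually bound to, assuming netstat is available on the machine:

# List listening TCP sockets and filter for the expected RM port
sudo netstat -tlnp | grep 8032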

Thanks

Upvotes: 4

Views: 8690

Answers (3)

GeoffreyMahugu

Reputation: 431

I had the same issue running a Hadoop instance on Kubernetes. The issue is in the error message itself: a connection error while attempting to connect to the ResourceManager.

PS: the ResourceManager listens on port 8032 (unless changed).

Make sure you are running the MapReduce job in the same network as the ResourceManager, as it will be listening on this address:

http://<RESOURCE_MANAGER_IP>:8032
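
A quick way to verify that reachability, assuming nc (netcat) is installed on the client host; <RESOURCE_MANAGER_IP> is the same placeholder as above:

# Reports success only if the RM port is reachable from this host
nc -zv <RESOURCE_MANAGER_IP> 8032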

Upvotes: 0

AKs

Reputation: 1745

This issue might be due to a missing HADOOP_CONF_DIR, which the MapReduce application needs in order to find the ResourceManager address configured in yarn-site.xml. So, before running the MapReduce job, try setting/exporting HADOOP_CONF_DIR manually to the appropriate Hadoop conf directory, e.g. export HADOOP_CONF_DIR=/etc/hadoop/conf. This worked for me :)
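
A minimal sketch of that workflow; the conf path is an example, so substitute your installation's actual directory:

# Point Hadoop clients at the directory containing yarn-site.xml
export HADOOP_CONF_DIR=/etc/hadoop/conf
# Re-run the example job from the question
hadoop jar hadoop-mapreduce-examples-2.6.0.jar grep input output 'dfs[a-z.]+'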

Upvotes: 0

many

Reputation: 1

Just remember one aspect of running Hadoop: it has three modes, standalone, pseudo-distributed, and fully distributed.

Standalone and pseudo-distributed modes run on a single node, i.e. on your machine only, and do not need the configuration you have shown: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html

A priori, this is all you need in yarn-site.xml for a single node:

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>

But further configuration can also be used. My yarn-site.xml for pseudo-distributed mode looks like this:

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>localhost:8025</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>localhost:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>localhost:8050</value>
    </property>
</configuration>

Tip: double-check the IP addresses you type into the config files. I suggest adding the IP to your /etc/hosts along with a hostname, and then using that hostname in the config files.
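
For example (a sketch only; the IP and hostname below are made up, so use your node's actual values):

# /etc/hosts
127.0.0.1     localhost
192.168.0.10  hadoop-master   # hypothetical hostname for this node

Then the config files can refer to the hostname instead of a raw IP:

<property>
    <name>yarn.resourcemanager.address</name>
    <value>hadoop-master:8050</value>
</property>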

Upvotes: 0
