DanaMihai

Reputation: 61

Cloudera Manager installation failed to receive heartbeat from agent when adding new hosts to cluster

I am trying to install Cloudera Manager (standard version) on Ubuntu 12.04.1 LTS. When I try to add a new host, I get the following error:

Installation failed. Failed to receive heartbeat from agent.
Ensure that the host's hostname is configured properly.
Ensure that port 7182 is accessible on the Cloudera Manager server (check firewall rules).
Ensure that ports 9000 and 9001 are free on the host being added.
Check agent logs in /var/log/cloudera-scm-agent/ on the host being added (some of the logs can be found in the installation details).

In the /etc/hosts file I have it configured as:

127.0.0.1 localhost
127.0.0.1 hadoop-ubuntu
192.168.5.xyz hadoop-ubuntu.dana.local hadoop-ubuntu
192.168.3.xyz ro-m81.dana.local ro-m81
192.168.3.abc ro-m41.dana.local ro-m41

# The following lines are desirable for IPv6 capable hosts

::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

The **/var/log/cloudera-scm-agent/cloudera-scm-agent.log** shows the following error:
[09/Oct/2013 16:04:23 +0000] 4532 MainThread agent ERROR Heartbeating to 192.168.5.xyz:7182 failed.
Traceback (most recent call last):
  File "/usr/lib64/cmf/agent/src/cmf/agent.py", line 747, in send_heartbeat
    response = self.requestor.request('heartbeat', dict(request=heartbeat))
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/avro-1.6.3-py2.6.egg/avro/ipc.py", line 145, in request
    return self.issue_request(call_request, message_name, request_datum)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/avro-1.6.3-py2.6.egg/avro/ipc.py", line 256, in issue_request
    call_response = self.transceiver.transceive(call_request)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/avro-1.6.3-py2.6.egg/avro/ipc.py", line 485, in transceive
    result = self.read_framed_message()
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/avro-1.6.3-py2.6.egg/avro/ipc.py", line 489, in read_framed_message
    response = self.conn.getresponse()
  File "/usr/lib64/python2.6/httplib.py", line 990, in getresponse
    response.begin()
  File "/usr/lib64/python2.6/httplib.py", line 391, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python2.6/httplib.py", line 349, in _read_status
    line = self.fp.readline()
  File "/usr/lib64/python2.6/socket.py", line 433, in readline
    data = recv(1)
error: [Errno 104] Connection reset by peer

Please help me find out why I get this error or what I am missing.

Upvotes: 5

Views: 19478

Answers (5)

Spandana r

Reputation: 283

1. First, check whether the Cloudera SCM agent is running: "sudo service cloudera-scm-agent status"

2. Check the agent log files in /var/log/cloudera-scm-agent/
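
As a rough sketch of step 2, the following Python snippet pulls the ERROR entries out of the agent log. The line layout it expects (bracketed timestamp, process id, thread, logger, level, message) is inferred only from the single log line quoted in the question, and the function name `error_lines` is made up for illustration.

```python
import re

# Layout inferred from the log line quoted in the question:
# [09/Oct/2013 16:04:23 +0000] 4532 MainThread agent ERROR Heartbeating to ... failed.
LOG_LINE = re.compile(
    r"^\[(?P<time>[^\]]+)\]\s+(?P<pid>\d+)\s+(?P<thread>\S+)\s+"
    r"(?P<logger>\S+)\s+(?P<level>[A-Z]+)\s+(?P<message>.*)$"
)


def error_lines(lines):
    """Return (timestamp, message) pairs for every ERROR-level log line."""
    found = []
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        # Traceback lines do not match the layout and are skipped.
        if match and match.group("level") == "ERROR":
            found.append((match.group("time"), match.group("message")))
    return found


if __name__ == "__main__":
    # Path taken from the question; reading it may require root.
    path = "/var/log/cloudera-scm-agent/cloudera-scm-agent.log"
    try:
        with open(path) as log:
            for when, message in error_lines(log):
                print(when, message)
    except OSError as exc:
        print("Could not read", path, "-", exc)
```

Run it on the host being added. If the agent writes its log in a different layout, the regular expression will need adjusting.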

Resolution Resource: http://commandstech.com/what-is-heartbeat-in-hadoop-how-to-resolve-heartbeat-lost-in-cloudera-and-hortonworks/

Upvotes: 0

Gowtham Balusamy

Reputation: 742

I faced the same problem and found a solution.

I used two machines, one as master and one as slave.

The master machine runs cloudera-scm-server.

I configured /etc/hosts on both machines as shown below, and the error went away.

Master machine IP: 192.168.1.10

/etc/hosts on the master machine:

127.0.0.1       localhost

192.168.1.10     <hostname>

Slave machine IP: 192.168.1.8

/etc/hosts on the slave machine:

127.0.0.1       localhost

192.168.1.8     <hostname>
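
The layout above can be checked mechanically. The sketch below assumes that the point of this configuration is that a machine's own hostname should appear only on its LAN address line and not on a loopback line; that reading is my assumption, not something the Cloudera documentation is quoted as saying here. The function names `parse_hosts` and `loopback_conflicts` are hypothetical.

```python
import ipaddress


def parse_hosts(text):
    """Map each name in /etc/hosts-style text to the list of addresses it is given."""
    names = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *hostnames = line.split()
        for name in hostnames:
            names.setdefault(name, []).append(address)
    return names


def loopback_conflicts(text, ignore=("localhost", "ip6-localhost", "ip6-loopback")):
    """Return names, other than those in `ignore`, that sit on a loopback line."""
    conflicts = []
    for name, addresses in parse_hosts(text).items():
        if name in ignore:
            continue
        for address in addresses:
            try:
                if ipaddress.ip_address(address).is_loopback:
                    conflicts.append(name)
                    break
            except ValueError:
                # Placeholder or malformed address (e.g. 192.168.5.xyz); skip it.
                continue
    return conflicts


if __name__ == "__main__":
    try:
        with open("/etc/hosts") as hosts:
            print("Names mapped to loopback:", loopback_conflicts(hosts.read()))
    except OSError as exc:
        print("Could not read /etc/hosts -", exc)
```

An empty list means only the ignored names sit on a loopback line, which matches the two files shown in this answer.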

Upvotes: 1

Matt

Reputation: 21

I had the same problem and finally fixed it.

In my case, the version of cloudera-scm-agent on the agent host differed from the version of cloudera-scm-server on the server. You can check the installed versions yourself with dpkg or yum.
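
For the dpkg case, one way to read the installed version from a script is sketched below. It shells out to `dpkg-query` with its `--showformat` option, so it only applies to Debian or Ubuntu hosts; the helper name `installed_version` is made up for illustration, and yum-based systems are not covered.

```python
import subprocess


def installed_version(package):
    """Return the installed version of a Debian/Ubuntu package, or None."""
    try:
        result = subprocess.run(
            ["dpkg-query", "-W", "--showformat=${Version}", package],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        return None  # dpkg-query is not available on this system
    version = result.stdout.strip()
    # A non-zero exit code means dpkg does not know the package.
    return version if result.returncode == 0 and version else None


if __name__ == "__main__":
    # Package names as given in this answer. Run this on each host
    # and compare the printed versions by eye.
    for package in ("cloudera-scm-agent", "cloudera-scm-server"):
        print(package, installed_version(package))
```

Each host normally has only one of the two packages, so expect `None` for the other one.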

Upvotes: 0

Jesuisme

Reputation: 1911

After checking the hosts files on all the nodes in the cluster, make sure that ports 7180 and 7182 are open on the installer and that port 9000 is open on the cluster nodes (other than the installer).

I was getting the "inspector failed. IO Exception thrown" error from the Cloudera install until I looked in the installer (server) logs and saw that the clients could not communicate on port 9000.
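
A quick way to test this from either side is a plain TCP connection attempt. The sketch below uses only the Python standard library; the function name `port_open` is hypothetical.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For example, `port_open("cm-server.example", 7182)` run on the host being added, or `port_open("new-node.example", 9000)` run on the installer, where both host names are placeholders. A `False` result does not distinguish between a firewall blocking the port and nothing listening on it.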

Upvotes: 0

vishnu viswanath

Reputation: 3854

I had the same issue. This is what did the trick for me.

Run ifconfig and find your IP address (not 127.0.0.1).

Run hostname to find your hostname.

Edit the /etc/hosts file.

Add an entry for your IP address there, something like:

192.168.8.xxx   hostname.test.com   hostname

Restart the Cloudera service. Go to hostname.test.com:7180 and try again. It should work. Even if it doesn't, go to http://hostname.test.com:7180/cmf/home and check the status of the hosts.

It turned out that, even though I was getting the heartbeat error, the host was actually up and running.
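
To confirm that the /etc/hosts change took effect, you can ask the resolver what the hostname maps to. This is a minimal sketch, assuming the goal of the steps above is for the hostname to resolve to the LAN address and not to a loopback address; `hostname_report` and its `resolve` parameter are names invented for this example.

```python
import ipaddress
import socket


def hostname_report(resolve=socket.gethostbyname, hostname=None):
    """Describe how a hostname (default: this machine's) resolves.

    Returns a dict with the hostname, the address it resolves to (or None
    if resolution fails), and whether that address is a loopback address.
    """
    name = hostname or socket.gethostname()
    try:
        address = resolve(name)
    except OSError:
        return {"hostname": name, "address": None, "loopback": None}
    return {
        "hostname": name,
        "address": address,
        "loopback": ipaddress.ip_address(address).is_loopback,
    }


if __name__ == "__main__":
    print(hostname_report())
```

If `loopback` is `True`, the hostname still maps to a 127.x.x.x address and the /etc/hosts entry probably needs another look.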

Upvotes: 1
