karthi

Reputation: 569

Failed to receive heartbeat from agent while installing Cloudera on Ubuntu 14.04

I'm getting the following error when trying to install Cloudera on Ubuntu 14.04:

Installation failed. Failed to receive heartbeat from agent.
Ensure that the host's hostname is configured properly.
Ensure that port 7182 is accessible on the Cloudera Manager Server (check firewall rules).
Ensure that ports 9000 and 9001 are not in use on the host being added.
Check agent logs in /var/log/cloudera-scm-agent/ on the host being added. (Some of the logs can be found in the installation details).
If Use TLS Encryption for Agents is enabled in Cloudera Manager (Administration -> Settings -> Security), ensure that /etc/cloudera-scm-agent/config.ini has use_tls=1 on the host being added. Restart the corresponding agent and click the Retry link here.

These are the agent logs that were created:

>>ProtocolError: <ProtocolError for 127.0.0.1/RPC2: 401 Unauthorized> 
>>[30/Jun/2016 01:10:51 +0000] 20081 MainThread agent INFO Trying to connect to newly launched supervisor (Attempt 5) 
>>[30/Jun/2016 01:10:51 +0000] 20081 MainThread agent ERROR Failed! trying again in 1 second(s) 
>>Traceback (most recent call last): 
>> File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.7.1-py2.7.egg/cmf/agent.py", line 2161, in connect_to_new_supervisor 
>> self.get_supervisor_process_info() 
>> File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.7.1-py2.7.egg/cmf/agent.py", line 2183, in get_supervisor_process_info 
>> self.identifier = self.supervisor_client.supervisor.getIdentification() 
>> File "/usr/lib/python2.7/xmlrpclib.py", line 1233, in __call__ 
>> return self.__send(self.__name, args) 
>> File "/usr/lib/python2.7/xmlrpclib.py", line 1587, in __request 
>> verbose=self.__verbose 
>> File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/supervisor-3.0-py2.7.egg/supervisor/xmlrpc.py", line 470, in request 
>> '' ) 
>>ProtocolError: <ProtocolError for 127.0.0.1/RPC2: 401 Unauthorized> 
>>[30/Jun/2016 01:10:51 +0000] 20081 MainThread agent ERROR Failed to connect to newly launched supervisor. Agent will exit 
>>[30/Jun/2016 01:10:51 +0000] 20081 MainThread agent INFO Stopping agent... 
>>[30/Jun/2016 01:10:51 +0000] 20081 MainThread agent INFO No extant cgroups; unmounting any cgroup roots 
>>[30/Jun/2016 01:10:51 +0000] 20081 MainThread agent INFO Cleaning up daemon 
>>[30/Jun/2016 01:10:51 +0000] 20081 Dummy-1 agent INFO Stopping agent... 
>>[30/Jun/2016 01:10:51 +0000] 20081 Dummy-1 agent INFO No extant cgroups; unmounting any cgroup roots 
>>[30/Jun/2016 01:10:51 +0000] 20081 Dummy-1 agent ERROR Shutdown callback failed. 
>>Traceback (most recent call last): 
>> File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.7.1-py2.7.egg/cmf/agent.py", line 2764, in stop 
>> f() 
>> File "/usr/lib/python2.7/asyncore.py", line 409, in close 
>> self.socket.close() 
>> File "/usr/lib/python2.7/asyncore.py", line 636, in close 
>> os.close(self.fd) 
>>OSError: [Errno 9] Bad file descriptor 
>>[30/Jun/2016 01:10:51 +0000] 20081 Dummy-1 agent INFO Cleaning up daemon 
>>[30/Jun/2016 01:12:58 +0000] 20663 MainThread agent INFO SCM Agent Version: 5.7.1 
>>[30/Jun/2016 01:12:58 +0000] 20663 MainThread agent INFO Agent Protocol Version: 4 
>>[30/Jun/2016 01:12:58 +0000] 20663 MainThread agent INFO Using Host ID: f7f7aaf2-8291-4659-a415-bdd18ca203c3 
>>[30/Jun/2016 01:12:58 +0000] 20663 MainThread agent INFO Using directory: /run/cloudera-scm-agent 
>>[30/Jun/2016 01:12:58 +0000] 20663 MainThread agent INFO Using supervisor binary path: /usr/lib/cmf/agent/build/env/bin/supervisord 
>>[30/Jun/2016 01:12:58 +0000] 20663 MainThread agent INFO Neither verify_cert_file nor verify_cert_dir are configured. Not performing validation of server certificates in HTTPS communication. These options can be configured in this agent's config.ini file to enable certificate validation. 
>>[30/Jun/2016 01:12:58 +0000] 20663 MainThread agent INFO Agent Logging Level: INFO 
>>[30/Jun/2016 01:12:58 +0000] 20663 MainThread agent INFO No command line vars 
>>[30/Jun/2016 01:12:58 +0000] 20663 MainThread agent INFO Found database jar: /usr/share/java/mysql-connector-java.jar 
>>[30/Jun/2016 01:12:58 +0000] 20663 MainThread agent INFO Missing database jar: /usr/share/java/oracle-connector-java.jar (normal, if you're not using this database type) 
>>[30/Jun/2016 01:12:58 +0000] 20663 MainThread agent INFO Found database jar: /usr/share/cmf/lib/postgresql-9.0-801.jdbc4.jar 
>>[30/Jun/2016 01:12:58 +0000] 20663 MainThread agent INFO Agent starting as pid 20663 user root(0) group root(0). 
END (0) 
end of agent logs. 
scm agent started 

This is what I currently have in /etc/hosts:

127.0.0.1   localhost.localdomain   localhost
127.0.1.1   humworld-Inc

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

In /opt/cloudera-manager/cm-5.7.1/etc/cloudera-scm-agent/config.ini, server_host is set to localhost and server_port to 7182. Do I need to change anything? Please help, I'm completely out of ideas.

Upvotes: 1

Views: 3379

Answers (2)

Sujit Rai

Reputation: 455

Check whether a supervisord process is already listening on its port:

netstat -tupnl | grep 19001   # default supervisord_port=19001

You will get a result like this:

tcp   0   0    127.0.0.1:19001   0.0.0.0:*    LISTEN   4833/python

Here 4833 is the supervisor process ID. Kill that process and restart the Cloudera agent:

kill -9 4833
service cloudera-scm-agent restart

If the issue still persists, check the agent log file.
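
If you would rather not read the PID off the netstat output by eye, the lookup can be sketched as a small shell function. This is only a sketch: pid_on_port is a made-up helper name (not part of Cloudera or net-tools), and it assumes the column layout of the netstat line shown above, with the PID/program pair in the seventh column.

```shell
#!/bin/sh
# pid_on_port (hypothetical helper): read `netstat -tupnl`-style lines
# on stdin and print the PID of the TCP listener on the given port.
pid_on_port() {
    awk -v port="$1" '
        # $4 = local address (e.g. 127.0.0.1:19001)
        # $6 = state, $7 = "PID/program"
        $1 ~ /^tcp/ && $6 == "LISTEN" {
            n = split($4, addr, ":")
            if (addr[n] == port) {
                split($7, proc, "/")
                # Without root, netstat shows "-" instead of a PID; skip it.
                if (proc[1] ~ /^[0-9]+$/) print proc[1]
            }
        }
    '
}

# Demonstration on the sample line from this answer.
sample='tcp   0   0    127.0.0.1:19001   0.0.0.0:*    LISTEN   4833/python'
printf '%s\n' "$sample" | pid_on_port 19001   # prints 4833
```

On the real host you would feed it live output as root, for example netstat -tupnl | pid_on_port 19001, and then check that the PID really is the supervisor before killing it.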

Upvotes: 7

raja

Reputation: 51

I faced the same ProtocolError. You can try killing the supervisor process from the command line with the kill command. The other fix that worked for me was to remove the cloudera-scm-agent packages on that particular host and install them again.

Upvotes: 1
