Scyla101

Reputation: 242

Hadoop YARN Job Tracker not starting

Preface:

I have a problem with a webapp that was developed by an employee who is no longer with the company. Since there is next to no documentation about the implementation available, I'm not sure where to get more input on the problem, hence this question.

I am trying to find a solution to a problem similar to this question (Hadoop pseudo distributed mode - Datanode and tasktracker not starting). However, since I have little experience with Hadoop, I can't determine what I need to fix to get the application working.

The Scenario:

The application is split into two parts:

  1. There is a Tomcat server that the application runs on; it handles the user input and provides the results of the MapReduce job (localhost:8080/WebApp).
  2. Then there is the main Hadoop node at localhost:50070.

    2.1. There is also a job tracker running at localhost:8088/cluster

All the applications are running on the same Ubuntu machine.
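
Since everything runs on one machine, a quick way to confirm which of these web UIs are actually listening is a port check (purely diagnostic; the ports simply mirror the URLs above):

sudo netstat -tlnp | grep -E ':8080|:50070|:8088'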

This is running correctly with the configuration that the former employee deployed. All I have as documentation is how to start the different servers. In the case of the Hadoop cluster, it is a script called up.sh.

What I did so far:

As a next step I want to migrate the application to a new network with static IP addresses. I configured a connection with the data from the IT department and changed the server.xml file of the Tomcat server so that the application is available through the new static IP address (172.16.254.1:8080/WebApp). This is working.
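
For reference, the change in server.xml is along these lines (a sketch with placeholder values, not the exact production file; the address attribute binds Tomcat's HTTP connector to the static IP):

<!-- bind the HTTP connector to the new static address -->
<Connector port="8080" protocol="HTTP/1.1"
           address="172.16.254.1"
           connectionTimeout="20000"
           redirectPort="8443" />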

The next step I took was changing the /etc/hosts file, where the old IP address was listed as the master for the Hadoop cluster.

So I changed this:

127.0.0.1   localhost
192.0.2.42  master

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

To this:

127.0.0.1    localhost
172.16.254.1 master

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
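
The new mapping can be verified with getent, which consults /etc/hosts (expected output shown as a comment):

getent hosts master
# 172.16.254.1    master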

The results:

With the changes in place I can start the main Hadoop node with the up.sh script and access it at localhost:50070. However, I cannot access the Hadoop job tracker at localhost:8088/cluster.
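
A useful check at this point is jps (it ships with the JDK and lists the running JVMs); on a healthy pseudo-distributed Hadoop 2.x node one would expect output along these lines (PIDs will differ):

jps
# 4821 NameNode
# 4903 DataNode
# 5127 ResourceManager
# 5289 NodeManager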

Inside the WebApp I can schedule MapReduce tasks, but the result is not correct. Vital data that should be calculated by the MapReduce cluster is missing.

The only indication of an error I found so far is the following error message inside the hadoop-hduser-namenode.log file:

2015-07-28 13:57:23,713 ERROR org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Error getting localhost name. Using 'localhost'...
java.net.UnknownHostException: ubuntu-machine: ubuntu-machine
    at java.net.InetAddress.getLocalHost(InetAddress.java:1461)
    at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.getHostname(MetricsSystemImpl.java:514)
    at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.configureSystem(MetricsSystemImpl.java:453)
    at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.configure(MetricsSystemImpl.java:449)
    at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.start(MetricsSystemImpl.java:180)
    at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.init(MetricsSystemImpl.java:156)
    at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.init(DefaultMetricsSystem.java:54)
    at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.initialize(DefaultMetricsSystem.java:50)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1253)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1320)

Since the call stack doesn't mention any classes that the employee developed, I assume the problem lies in the Hadoop/network configuration. The logs of the application server also don't list any errors. I'm not sure what part I'm missing.
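
The UnknownHostException above comes from InetAddress.getLocalHost(), which resolves the machine's own hostname; whether that lookup succeeds can be checked directly (a diagnostic sketch; empty output from getent reproduces the failure):

hostname
# ubuntu-machine
getent hosts "$(hostname)"
# no output here means the hostname does not resolve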

If you need more information about the content of the configuration files, let me know and I will provide it.

Upvotes: 0

Views: 839

Answers (1)

Amal G Jose

Reputation: 2546

The problem is with the hostname. Add the ubuntu-machine mapping to the /etc/hosts file:

127.0.0.1    localhost ubuntu-machine
172.16.254.1 master ubuntu-machine

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

Another thing to check is the version of Hadoop. Hadoop has undergone a major change recently, so there are two major versions available: Hadoop 1.x and Hadoop 2.x. The change happened in the processing layer. In Hadoop 1.x we have the JobTracker and TaskTracker; in Hadoop 2.x we have the ResourceManager, NodeManager and ApplicationMaster. The installation steps for the two versions are different. Type hadoop version on the command line and verify the version of Hadoop that you are using.
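
The first line of the output identifies the version (sample output; the exact version string depends on your installation):

hadoop version
# Hadoop 2.6.0  -> a 2.x line, i.e. ResourceManager/NodeManager instead of JobTracker/TaskTracker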

If it is 1.x, then the JobTracker web UI will be at http://jobtrackerhost:50030.

If it is 2.x, then the ResourceManager web UI will be at http://resourcemanagerhost:8088.

For starting the existing services, you don't have to worry much. First fix the hostname issue and try starting the services. If you have important data stored in the cluster, don't format the cluster. If you do format the cluster, clear the datanode directories as well. The command for starting all Hadoop services in one go is given below.

Go to HADOOP_HOME/bin

cd $HADOOP_HOME/bin
./start-all.sh
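
Note that on Hadoop 2.x the start scripts live in sbin rather than bin, and start-all.sh is deprecated there in favor of starting HDFS and YARN separately:

cd $HADOOP_HOME/sbin
./start-dfs.sh   # starts NameNode, DataNode and SecondaryNameNode
./start-yarn.sh  # starts ResourceManager and NodeManager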

Upvotes: 1
