KFed

Reputation: 84

hadoop-2.2.0 mapreduce not working on ubuntu

I've installed Hadoop 2.2.0 on 64-bit Ubuntu 12.04.3 (precise) and set up the XML configuration files as suggested in a blog post (http://tuliodomingos.blogspot.com.es/2013/04/installing-apache-hadoop-in-ubuntu-linux.html if you're interested).

The aim is to have a "single node cluster" for dfs and mapreduce.

I often get the following warning, apparently because a native library is missing, but I don't think it is causing the problem:

WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

[I tried a build from Maven but got confused: there seemed to be iteration after iteration of compilation, and I had no idea what was actually going on.]

Anyway, with my downloaded (non-Maven) Hadoop, the distributed file system seems to behave itself. However, when I try to run the WordCount mapreduce example as per the tutorials, I get stuck. The jobs are submitted OK, but they never seem to actually run. The attached "mr_output.txt" is what is returned in the terminal.

Also, the local monitoring sites (sorry I can't post these images) indicate zero active nodes. I don't understand why, considering that the dfs operations all work.

Also, here is the output of hdfs dfsadmin -report:

13/11/06 14:08:20 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Configured Capacity: 412849389568 (384.50 GB)
Present Capacity: 134156435456 (124.94 GB)
DFS Remaining: 134152601600 (124.94 GB)
DFS Used: 3833856 (3.66 MB)
DFS Used%: 0.00%
Under replicated blocks: 1
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 1 (1 total, 0 dead)

Live datanodes:
Name: 127.0.0.1:50010 (localhost)
Hostname: rimmer-Inspiron-7520
Decommission Status : Normal
Configured Capacity: 412849389568 (384.50 GB)
DFS Used: 3833856 (3.66 MB)
Non DFS Used: 278692954112 (259.55 GB)
DFS Remaining: 134152601600 (124.94 GB)
DFS Used%: 0.00%
DFS Remaining%: 32.49%
Last contact: Wed Nov 06 14:08:18 EST 2013

If I try to invoke "yarn resourcemanager" or "yarn nodemanager" I get a very long stream of messages. The error I can see is:

13/11/06 14:15:11 FATAL nodemanager.NodeManager: Error starting NodeManager
java.lang.IllegalArgumentException: The ServiceName: mapreduce.shuffle set in yarn.nodemanager.aux-services is invalid.The valid service name should only contain a-zA-Z0-9_ and can not start with numbers

This is despite "yarn.nodemanager.aux-services" being set to "mapreduce.shuffle" in the file "yarn-site.xml".

I've gone through the official docs a bunch of times and also hit google and forums pretty hard. Any wisdom greatly appreciated.

Best,

Kieran

Upvotes: 2

Views: 3435

Answers (3)

mandar2812

Reputation: 21

Even after changing the value of "yarn.nodemanager.aux-services" to "mapreduce_shuffle", there would still be problems getting the namenode up.

It seems that Hadoop 2.2.0 was shipped to work out of the box on 32-bit machines only. The folder structure changed from 1.2.0, and the $HADOOP_INSTALL/lib directory now contains only one set of libraries (those that work on 32-bit systems).

Earlier, in 1.2.0, that libraries directory held two subdirectories, "Linux-amd64-64" and "Linux-i386-32", one for each of the 64-bit and 32-bit architectures.

There is a discussion about it here:

https://issues.apache.org/jira/browse/HADOOP-9911

There is also a page suggesting that you can compile from source on a 64-bit system:

http://blog.csdn.net/focusheart/article/details/14058153

P.S. I haven't been able to compile it without errors, though. The issue in the JIRA thread above is unresolved as well.

EDIT: Because of all the above, everything except the namenode is up and running, which is why you would see the nodemanager, resourcemanager, secondarynamenode (as far as I know it can't "replace" a namenode) and datanode up and running.

Upvotes: 2

Graham Lea

Reputation: 6333

For some reason, the valid format for service names changed between Hadoop 2.1.0 and 2.2.0.

The correct value is now mapreduce_shuffle (with an underscore) instead of mapreduce.shuffle (with a dot).
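As a rough sketch, the relevant part of yarn-site.xml would then look something like the fragment below. The second property (the handler class) is not mentioned in the question; it is taken from the 2.2.0 documentation, so treat it as an assumption about your setup and check it against your own file. Note that the service name also appears inside that property's name, so it needs the underscore there too.

```xml
<!-- Fragment of yarn-site.xml; goes inside the existing <configuration> element. -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <!-- underscore, not a dot -->
  <value>mapreduce_shuffle</value>
</property>
<property>
  <!-- assumption: only needed if your file already declares the handler class -->
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
```

After editing the file, restart the nodemanager so the new value is picked up.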

cf. http://hadoop.apache.org/docs/r2.2.0/hadoop-mapreduce-client/hadoop-mapreduce-client-core/PluggableShuffleAndPluggableSort.html
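To see why the old name is rejected, here is a small Python sketch of the rule quoted in the error message (only a-zA-Z0-9_, and not starting with a number). The regex and the function name is_valid_service_name are my own reconstruction from that message, not code copied from Hadoop.

```python
import re

# Assumed pattern, reconstructed from the wording of the NodeManager error:
# only letters, digits and underscores, and the first character is not a digit.
SERVICE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_service_name(name: str) -> bool:
    """Return True if the whole name satisfies the rule from the error message."""
    return SERVICE_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("mapreduce.shuffle", "mapreduce_shuffle"):
        print(candidate, is_valid_service_name(candidate))
```

The dot in mapreduce.shuffle is the only character that breaks the rule, which is why swapping it for an underscore is enough.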

Upvotes: 4

ajsmith007

Reputation: 106

Did you try to run just the standalone mode first?

Upvotes: 0
