mrp

Reputation: 711

Installing, Configuring, and running Hadoop 2.2.0 on Mac OS X

I've installed hadoop 2.2.0, and set up everything (for a single node) based on this tutorial here: Hadoop YARN Installation. However, I can't get hadoop to run.

I think my problem is that I can't connect to my localhost, but I'm not really sure why. I've spent upwards of 10 hours installing, googling, and hating open-source software installation guides, so I've now turned to the one place that has never failed me.

Since a picture is worth a thousand words, I give you my setup ... in many, many pictures:


Basic profile/setup


I'm running Mac OS X (Mavericks 10.9.5)

[screenshot: Sharing preferences]

For whatever it's worth, here's my /etc/hosts file:

[screenshot: /etc/hosts]
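In case that's unreadable: it's more or less the stock Mavericks hosts file, i.e. something like this, with localhost resolving to 127.0.0.1:

127.0.0.1       localhost
255.255.255.255 broadcasthost
::1             localhost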

My bash profile:

[screenshot: bash profile]
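Again, in case the screenshot is illegible, the Hadoop-relevant part of the profile is along these lines (paths approximate, per the tutorial linked above):

export JAVA_HOME=$(/usr/libexec/java_home)
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin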


Hadoop file configurations


The setup for core-site.xml and hdfs-site.xml:

[screenshot: core-site.xml and hdfs-site.xml]

Note: I have created folders in the locations shown above.
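If that picture is too small to read, the gist of the two files is roughly the following (values approximate; the port matches the one in the error further down):

core-site.xml:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:8020</value>
  </property>
</configuration>

hdfs-site.xml:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///usr/local/hadoop-2.2.0/yarn_data/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///usr/local/hadoop-2.2.0/yarn_data/hdfs/datanode</value>
  </property>
</configuration>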

The setup for my yarn-site.xml:

[screenshot: yarn-site.xml]
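(Legibility again: this one is the standard two-property single-node setup for 2.2.0, give or take:)

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>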

Setup for my hadoop-env.sh file:

[screenshot: hadoop-env.sh]
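In text form, the relevant change in that file boils down to setting JAVA_HOME, something like:

# Point Hadoop at the Mac's JDK
export JAVA_HOME=$(/usr/libexec/java_home)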


Side Note


Before I show the results of when I run start-dfs.sh, start-yarn.sh, and check what's running with jps, keep in mind that I have a symlink, hadoop, pointing to hadoop-2.2.0.

[screenshot: hadoop symlink listing]
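(In other words, the link was created along these lines:)

cd /usr/local
sudo ln -s hadoop-2.2.0 hadoop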


Starting up Hadoop


Now, here are the results when I start the daemons up:

[screenshot: start-dfs.sh / start-yarn.sh output]

For those of you who don't have a microscope (it looks super small in the preview of this post), here's a code chunk of what's shown above:

mrp:~ mrp$ start-dfs.sh
2014-11-08 13:06:05.695 java[17730:1003] Unable to load realm info from SCDynamicStore
14/11/08 13:06:05 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /usr/local/hadoop-2.2.0/logs/hadoop-mrp-namenode-mrp.local.out
localhost: starting datanode, logging to /usr/local/hadoop-2.2.0/logs/hadoop-mrp-datanode-mrp.local.out
localhost: 2014-11-08 13:06:10.954 java[17867:1403] Unable to load realm info from SCDynamicStore
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop-2.2.0/logs/hadoop-mrp-secondarynamenode-mrp.local.out
0.0.0.0: 2014-11-08 13:06:16.065 java[17953:1403] Unable to load realm info from SCDynamicStore
2014-11-08 13:06:20.982 java[17993:1003] Unable to load realm info from SCDynamicStore
14/11/08 13:06:20 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

mrp:~ mrp$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/logs/yarn-mrp-resourcemanager-mrp.local.out
2014-11-08 13:06:43.765 java[18053:20b] Unable to load realm info from SCDynamicStore
localhost: starting nodemanager, logging to /usr/local/hadoop-2.2.0/logs/yarn-mrp-nodemanager-mrp.local.out

Check to see what's running:

[screenshot: jps output]
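For reference, on a healthy single-node setup jps should show all five daemons plus itself, e.g. (PIDs arbitrary):

17766 NameNode
17867 DataNode
17953 SecondaryNameNode
18053 ResourceManager
18120 NodeManager
18326 Jps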


Time Out


OK. So far, I think, so good. At least this looks good based on all the other tutorials and posts. I think.

Before I try to do anything fancy, I just want to see if it's working properly, so I run a simple command like hadoop fs -ls.


Failure


When I run hadoop fs -ls, here's what I get:

[screenshot: hadoop fs -ls error]

Again, in case you can't see that pic, it says:

2014-11-08 13:23:45.772 java[18326:1003] Unable to load realm info from SCDynamicStore
14/11/08 13:23:45 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
ls: Call From mrp.local/127.0.0.1 to localhost:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused

I've tried to run other commands, and I get the same basic error in the beginning of everything:

Call From mrp.local/127.0.0.1 to localhost:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused

Now, I've gone to the website mentioned, but honestly, everything in that link means nothing to me. I don't get what I should do.
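The one thing I did manage to take away from it: the error apparently means nothing is listening on localhost:8020, which is the NameNode's port. If it helps, a check along these lines should list the NameNode's java process when it's actually up:

# should show a java process LISTENing on 8020 if the NameNode is alive
lsof -nP -i tcp:8020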

I would very much appreciate any assistance with this. You'll make me the happiest hadooper ever.

...this should go without saying, but obviously I'd be happy to edit/update with more info if needed. Thanks!

Upvotes: 2

Views: 1293

Answers (4)

Doug Donohoe

Reputation: 417

Since the native library isn't supported on Mac, if you want to suppress this warning:

WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Add this to the log4j.properties in ${HADOOP_HOME}/libexec/etc/hadoop:

# Turn off native library warning
log4j.logger.org.apache.hadoop.util.NativeCodeLoader=ERROR

Upvotes: 0

mrp

Reputation: 711

So I've got Hadoop up and running. I had two problems (I think).

  1. When starting up the NameNode and DataNode, I received the following error: Unable to load realm info from SCDynamicStore.

To fix this, I added the following two lines to my hadoop-env.sh file:

HADOOP_OPTS="${HADOOP_OPTS} -Djava.security.krb5.realm= -Djava.security.krb5.kdc="
HADOOP_OPTS="${HADOOP_OPTS} -Djava.security.krb5.conf=/dev/null"

I found those two lines in the answer by Matthew L Daniel to this post: Hadoop on OSX "Unable to load realm info from SCDynamicStore".

  2. I had formatted the NameNode folder more than once, which apparently screws things up?

I can't verify that this screws things up, because I don't have any errors in any of my log files; however, once I followed Workaround 1 (deleting and recreating the NameNode/DataNode folders, then reformatting) from this post, No data nodes are started, I was able to bring up the DataNode and get everything working.
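For completeness, the workaround boiled down to something like the following (substitute the NameNode/DataNode directories from your own hdfs-site.xml; /path/to is a placeholder):

stop-dfs.sh
rm -rf /path/to/yarn_data/hdfs/namenode /path/to/yarn_data/hdfs/datanode
mkdir -p /path/to/yarn_data/hdfs/namenode /path/to/yarn_data/hdfs/datanode
hdfs namenode -format
start-dfs.sh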

Upvotes: 0

Klaus Thul

Reputation: 685

I had a very similar problem and found this question while googling for a solution.

Here is how I resolved it (on Mac OS 10.10 with Hadoop 2.5.1); I'm not sure whether the question is exactly the same problem. I checked the log files generated by the datanode (/usr/local/hadoop-2.2.0/logs/hadoop-mrp-datanode-mrp.local.out) and found the following entry:

2014-11-09 17:44:35,238 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode:
Exception in namenode join org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: 
Directory /private/tmp/hadoop-kthul/dfs/name is in an inconsistent state: storage
directory does not exist or is not accessible.

Based on this, I concluded that something is wrong with the HDFS data on the datanode.

I deleted the directory with the HDFS data and reformatted HDFS:

rm -rf /private/tmp/hadoop-kthul
hdfs namenode -format

Now, I am up and running again. Still wondering if /private/tmp is a good place to keep the HDFS data - looking for options to change this.
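(One candidate, if I read the docs right, is to pin the storage directories in hdfs-site.xml to a permanent location and reformat afterwards; the paths below are only an example:)

<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:///usr/local/hadoop_data/hdfs/namenode</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:///usr/local/hadoop_data/hdfs/datanode</value>
</property>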

Upvotes: 1

Milad Qasemi

Reputation: 3059

Add these to .bashrc:

export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"

Upvotes: 1
