Reputation: 60871
I am following Hadoop in Action to get started with Hadoop on EC2. I'm running Ubuntu and have downloaded and installed the latest version of Hadoop. I am hitting a roadblock at this command:
hadoop-ec2 launch-cluster mycluster 2
The book says "The Hadoop EC2 tools are in the directory src/contrib/ec2/bin under your Hadoop installation. Recall that our ec2-init.sh script has already added that directory to your system PATH. Within that directory is hadoop-ec2, which is a meta-command for executing other commands. To launch a Hadoop Cluster on ec2 use:
hadoop-ec2 launch-cluster <cluster-name> <number-of-slaves>"
The response I get is: hadoop-ec2: command not found
I noticed that the variable $HADOOP_HOME is not set. It looks like this book is outdated: HADOOP_HOME is deprecated. Is this true? I can run ec2-describe-images and get all the available images that I can use. Why doesn't the hadoop-ec2 command work? Thank you for your guidance.
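A quick way to narrow down a "command not found" like this is to check whether the script exists at all under the directory the book names, or whether it exists but is simply not on PATH. The sketch below assumes a hypothetical HADOOP_INSTALL variable pointing at the Hadoop directory and a hypothetical helper named check_tool; neither comes from the book or from Hadoop itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Hadoop): report whether a command is
# reachable via PATH, present in a given directory only, or missing.
check_tool() {
  local tool="$1" dir="$2"
  if command -v "$tool" >/dev/null 2>&1; then
    echo "on-path"
  elif [ -x "$dir/$tool" ]; then
    echo "found-not-on-path"
  else
    echo "missing"
  fi
}

# HADOOP_INSTALL is an assumed name; point it at your own Hadoop directory.
HADOOP_INSTALL="${HADOOP_INSTALL:-$HOME/hadoop}"
# Directory quoted in the book; newer releases may not ship it.
EC2_BIN="$HADOOP_INSTALL/src/contrib/ec2/bin"

case "$(check_tool hadoop-ec2 "$EC2_BIN")" in
  on-path)           echo "hadoop-ec2 is already on PATH" ;;
  found-not-on-path) echo "try: export PATH=\"\$PATH:$EC2_BIN\"" ;;
  missing)           echo "no hadoop-ec2 under $EC2_BIN" ;;
esac
```

If the last branch fires, the installed release probably does not include the EC2 scripts the book relies on, and fixing PATH alone will not help.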
Upvotes: 1
Views: 729
Reputation: 64761
Unfortunately, the dedicated page Running Hadoop on Amazon EC2 (which indeed doesn't use HADOOP_HOME) turns out to be fairly out of date itself and doesn't seem to apply to the most recent stable version anymore (1.0.4 at the time of this writing). I'm not aware of an updated 'native' tutorial, but apparently users are quite happy with an approach via Apache Whirr (which incidentally started out in 2007 as some bash scripts in Apache Hadoop for running Hadoop clusters on EC2).
Accordingly, there is a Getting Started with Whirr™ guide available; in addition, there are related third-party tutorials.
I hope you'll be able to merge the information in the book about using Apache Hadoop with these tutorials on running a Hadoop cluster via Apache Whirr - good luck!
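To give a rough idea of how the book's `hadoop-ec2 launch-cluster mycluster 2` might translate, here is a minimal sketch that writes a Whirr properties file for one master and two workers. The property names and role names are as I recall them from the Whirr quick start, so please verify them against the release you install; the helper write_whirr_config and the file name hadoop.properties are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a Whirr config with one master and N workers.
# Property and role names are assumed from the Whirr quick start; verify them.
write_whirr_config() {
  local file="$1" name="$2" workers="$3"
  {
    printf 'whirr.cluster-name=%s\n' "$name"
    printf 'whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,%s hadoop-datanode+hadoop-tasktracker\n' "$workers"
    printf 'whirr.provider=aws-ec2\n'
    # Whirr resolves ${env:...} itself; single quotes stop the shell from
    # trying to expand it.
    printf '%s\n' 'whirr.identity=${env:AWS_ACCESS_KEY_ID}'
    printf '%s\n' 'whirr.credential=${env:AWS_SECRET_ACCESS_KEY}'
  } > "$file"
}

write_whirr_config hadoop.properties mycluster 2

# Nothing is launched here; these are the commands you would run next.
echo "next:  whirr launch-cluster --config hadoop.properties"
echo "later: whirr destroy-cluster --config hadoop.properties"
```

The script only generates the file and prints the follow-up commands, so it is safe to run without AWS credentials; launching for real requires AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY in the environment and will incur EC2 charges until the cluster is destroyed.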
Upvotes: 1