Reputation: 999
I just set up a new Ubuntu 12.04 VM (VirtualBox) and want to test Hadoop on it. I am following this guide: http://hadoop.apache.org/docs/r0.20.2/quickstart.html
I think I am doing something wrong with the java installation and the JAVA_HOME path... Right now bin/hadoop always just returns "command not found"
Where do I have to extract the hadoop folder?
Do I need to set up SSH before? What about SSHD?
What are the commands to install the correct java version?
What EXACTLY do I have to enter into the hadoop-env.sh file?
Thanks!
Upvotes: 0
Views: 2140
Reputation: 1264
Installing Hadoop, Hive, Sqoop and Pig
Follow these steps to install the applications above. Note: there is no need for an extra user; you can work as your existing user.
Download Hadoop 2.7.1, Pig, Sqoop and Hive:
http://www.us.apache.org/dist/hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz
http://www.us.apache.org/dist/pig/pig-0.13.0/pig-0.13.0.tar.gz
http://www.us.apache.org/dist/sqoop/1.4.6/sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz
http://www.eu.apache.org/dist/hive/hive-1.2.1/apache-hive-1.2.1-bin.tar.gz
Extract the archives into a folder, say /home/mypc/hadoop-soft, then cd hadoop-soft. Each application gets its own folder:
hive --> /home/mypc/hadoop-soft/hive
sqoop --> /home/mypc/hadoop-soft/sqoop
pig --> /home/mypc/hadoop-soft/pig
hadoop --> /home/mypc/hadoop-soft/hadoop
Make sure there is no extra subfolder level inside these folders: each one should directly contain its bin folder.
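One way to get that layout, as a minimal sketch: the helper name extract_flat is hypothetical (it is not part of Hadoop), and it assumes GNU tar, which is what Ubuntu ships. It unpacks an archive into a folder you name and drops the archive's own top-level directory, so bin ends up directly inside the target.

```shell
# extract_flat ARCHIVE TARGET
# Hypothetical helper: unpack a .tar.gz into TARGET and drop the archive's
# top-level directory, so TARGET/bin exists directly.
extract_flat() (
    archive=$1
    target=$2
    mkdir -p "$target" &&
    tar -xzf "$archive" -C "$target" --strip-components=1
)
```

For example, extract_flat hadoop-2.7.1.tar.gz /home/mypc/hadoop-soft/hadoop should leave you with /home/mypc/hadoop-soft/hadoop/bin.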
Now let's move these folders to /usr/lib:
sudo mkdir /usr/lib/hadoop
sudo mv sqoop/ /usr/lib/hadoop/
sudo mv pig/ /usr/lib/hadoop/
sudo mv hive/ /usr/lib/hadoop/
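sudo mv hadoop-2.6/ /usr/lib/hadoop/
(Use whatever name you gave the extracted Hadoop folder here; the rest of this guide assumes it ends up at /usr/lib/hadoop/hadoop-2.6.)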
Edit the .bashrc file to add the paths.
Remove any existing JAVA_HOME statement, since we set it here.
Check that Java is installed at the location given below. If it is not, look up how to install Java on Ubuntu.
gedit ~/.bashrc
Add the following lines at the end of .bashrc:
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export HADOOP_HOME=/usr/lib/hadoop/hadoop-2.6
export HIVE_HOME=/usr/lib/hadoop/hive
export PIG_HOME=/usr/lib/hadoop/pig
export SQOOP_HOME=/usr/lib/hadoop/sqoop
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$PIG_HOME/bin:$HIVE_HOME/bin:$SQOOP_HOME/bin
Save the file and close it. Then source it so that the changes take effect in the current shell:
source ~/.bashrc
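A quick sanity check after sourcing the file, as a rough sketch: check_home is a hypothetical helper name, and all it does is confirm that a directory contains a bin subfolder. It does not verify that the software inside actually works.

```shell
# check_home NAME DIR
# Hypothetical helper: report whether DIR contains a bin/ subfolder.
check_home() {
    if [ -d "$2/bin" ]; then
        echo "$1 ok: $2"
    else
        echo "$1 missing bin/: $2"
        return 1
    fi
}
```

You could call it once per variable, for example check_home HADOOP_HOME "$HADOOP_HOME" and check_home JAVA_HOME "$JAVA_HOME". A "missing bin/" line usually points to a typo in .bashrc or to an extra subfolder level left over from extraction.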
Create two directories, namenode and datanode:
cd /usr/lib
sudo mkdir hdfs
cd hdfs
sudo mkdir namenode
sudo mkdir datanode
sudo chmod 777 -R namenode
sudo chmod 777 -R datanode
Go to $HADOOP_HOME and edit some files.
cd $HADOOP_HOME
cd etc/hadoop/
A. sudo gedit yarn-site.xml : Add these lines inside <configuration> </configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
B. sudo gedit core-site.xml : Add these lines inside <configuration> </configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
C. sudo gedit hdfs-site.xml : Add these lines inside <configuration> </configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/lib/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/lib/hdfs/datanode</value>
</property>
D. sudo gedit mapred-site.xml : Add these lines
<?xml version="1.0"?>
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Note: This will be a new file. Save it and close.
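Since four files are edited by hand here, it can help to confirm that each property actually landed in the right file. A minimal sketch, with has_property as a hypothetical helper name: it only looks for the literal <name> line, so it does not prove the XML is well formed or that the value is correct.

```shell
# has_property FILE NAME
# Hypothetical helper: succeed if FILE contains a <name>NAME</name> entry.
# -F treats the pattern as a fixed string, so dots in NAME are literal.
has_property() {
    grep -qF "<name>$2</name>" "$1"
}
```

For example, from $HADOOP_HOME/etc/hadoop: has_property mapred-site.xml mapreduce.framework.name && echo found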
Format the namenode:
hdfs namenode -format
Go to /usr/lib/hdfs and create start and stop scripts
cd /usr/lib/hdfs
sudo mkdir scripts
sudo chmod 777 -R scripts
cd scripts
sudo gedit hadoopstart.sh
Write these lines
/usr/lib/hadoop/hadoop-2.6/sbin/hadoop-daemon.sh start namenode
/usr/lib/hadoop/hadoop-2.6/sbin/hadoop-daemon.sh start datanode
/usr/lib/hadoop/hadoop-2.6/sbin/yarn-daemon.sh start resourcemanager
/usr/lib/hadoop/hadoop-2.6/sbin/yarn-daemon.sh start nodemanager
/usr/lib/hadoop/hadoop-2.6/sbin/mr-jobhistory-daemon.sh start historyserver
Save it and close.
sudo gedit hadoopstop.sh
Write these lines
/usr/lib/hadoop/hadoop-2.6/sbin/hadoop-daemon.sh stop namenode
/usr/lib/hadoop/hadoop-2.6/sbin/hadoop-daemon.sh stop datanode
/usr/lib/hadoop/hadoop-2.6/sbin/yarn-daemon.sh stop resourcemanager
/usr/lib/hadoop/hadoop-2.6/sbin/yarn-daemon.sh stop nodemanager
/usr/lib/hadoop/hadoop-2.6/sbin/mr-jobhistory-daemon.sh stop historyserver
Save it and close.
To start
sh /usr/lib/hdfs/scripts/hadoopstart.sh
To stop
sh /usr/lib/hdfs/scripts/hadoopstop.sh
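The two scripts differ only in the word start or stop, so they could be folded into one. A sketch under assumptions: hadoopctl and the DRY_RUN variable are names invented for this example, and the default path is the /usr/lib/hadoop/hadoop-2.6 location used above. The daemon scripts and their arguments are the same ones listed in the two scripts.

```shell
# hadoopctl start|stop
# Hypothetical wrapper around the daemon scripts used above.
# Set DRY_RUN to any non-empty value to print the commands instead of running them.
hadoopctl() (
    action=$1
    sbin=${HADOOP_HOME:-/usr/lib/hadoop/hadoop-2.6}/sbin
    run=${DRY_RUN:+echo}    # empty unless DRY_RUN is set
    case $action in
        start|stop) ;;
        *) echo "usage: hadoopctl start|stop" >&2; exit 2 ;;
    esac
    $run "$sbin/hadoop-daemon.sh" "$action" namenode
    $run "$sbin/hadoop-daemon.sh" "$action" datanode
    $run "$sbin/yarn-daemon.sh" "$action" resourcemanager
    $run "$sbin/yarn-daemon.sh" "$action" nodemanager
    $run "$sbin/mr-jobhistory-daemon.sh" "$action" historyserver
)
```

Running DRY_RUN=1 hadoopctl start first shows exactly which commands would be executed, which is a cheap way to catch a wrong HADOOP_HOME before anything is started.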
Check that Hadoop is running after you run the start script:
hadoop version
hadoop fs -ls /
Open http://localhost:50070 to see if name node is running.
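Another check is jps, the process lister that ships with the JDK: each running daemon shows up as a line with a PID and a class name. A small sketch, where missing_daemons is a hypothetical helper and the list of names assumes the five daemons started by the script above:

```shell
# missing_daemons
# Hypothetical helper: read `jps` output on stdin and print the name of
# every expected daemon that does not appear in it.
missing_daemons() (
    listing=$(cat)
    for d in NameNode DataNode ResourceManager NodeManager JobHistoryServer; do
        # Compare the second column exactly, so SecondaryNameNode does not
        # count as NameNode.
        printf '%s\n' "$listing" |
            awk -v d="$d" '$2 == d { found = 1 } END { exit !found }' ||
            echo "$d"
    done
)
```

Use it as jps | missing_daemons. No output means all five were found; any name printed is a daemon whose log under $HADOOP_HOME/logs is worth reading.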
Run the various services from the terminal:
pig
sqoop
hive
Upvotes: 0
Reputation: 121
I used this great tutorial. The only change was that I installed the default Java 6...
Michael Noll Tutorial for setting up Hadoop
Upvotes: 2
Reputation: 18460
The "command not found" error when running hadoop should not be related to JAVA_HOME. I believe you are not running this command from the Hadoop home directory (the alternative is to add the full path to hadoop/bin to your PATH).
You can extract the hadoop folder anywhere you like.
For hadoop-env.sh, you should set the JAVA_HOME variable to point to your Java installation home directory, e.g. export JAVA_HOME=/home/jdk1.6.0/
Change this path to reflect your environment.
You will need SSH and SSHD, especially if you will run Hadoop in a distributed or pseudo-distributed environment.
Hadoop requires Java 1.6+. Just download jdk-7u9-linux-i586.tar.gz from here and follow the installation guide (it should not require more than unpacking it).
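If you are unsure what to put in JAVA_HOME, you can derive it from the java binary that is already on your PATH. This is only a sketch: java_home_from is a hypothetical helper, and it assumes the usual JDK layout where the binary sits at <home>/bin/java or <home>/jre/bin/java.

```shell
# java_home_from PATH_TO_JAVA
# Hypothetical helper: turn the resolved path of the java binary into a
# JAVA_HOME candidate by stripping a trailing /bin/java and then /jre.
java_home_from() {
    p=${1%/bin/java}
    echo "${p%/jre}"
}
```

On Ubuntu you could feed it the resolved path with java_home_from "$(readlink -f "$(command -v java)")" and paste the result into the export JAVA_HOME=... line of hadoop-env.sh. Check that the printed directory really exists before relying on it.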
Upvotes: 1