Reputation: 1490
Hi, I'm new to Hadoop and started learning a couple of days ago. I followed the instructions from Digital Ocean to set up a Hadoop cluster, then tried the simple WordCount example program from the Hadoop docs.
My Hadoop version is 2.5.1, the same version used in the tutorial, and it's running on Ubuntu Precise. I made sure to do the setup exactly as the tutorial describes. Here is the end of my ~/.bashrc:
...
#HADOOP VARIABLES START
export JAVA_HOME=/usr/lib/jvm/java-7-oracle
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
export HADOOP_PREFIX=/usr/local/hadoop
#HADOOP VARIABLES END
I also checked my Java alternatives configuration; the result is below:
sudo update-alternatives --config java
There are 3 choices for the alternative java (providing /usr/bin/java).
Selection Path Priority Status
------------------------------------------------------------
0 /usr/lib/jvm/java-6-openjdk-amd64/jre/bin/java 1061 auto mode
1 /usr/lib/jvm/java-6-openjdk-amd64/jre/bin/java 1061 manual mode
2 /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java 1051 manual mode
* 3 /usr/lib/jvm/java-7-oracle/jre/bin/java 1 manual mode
So I changed every JAVA_HOME value, in both the .bashrc and hadoop-env.sh files, to /usr/lib/jvm/java-7-oracle. I also made sure that both DFS and YARN are started.
However, when I compile WordCount.java using this command:
hadoop com.sun.tools.javac.Main WordCount.java
nothing goes my way and I get the error below. Note that I'm using the hadoop command instead of bin/hadoop; the command itself works properly, since its location is defined in the .bashrc file.
Error: Could not find or load main class com.sun.tools.javac.Main
What is the possible cause of this error, and how do I get rid of it? I think it might be a Java classpath issue, but I still can't figure out the details. Every workaround I've found for this problem is about executing the java or javac command, not the hadoop command.
I just want to get the sample program working first, before digging into how it works. Any help would be appreciated. Thanks.
Upvotes: 5
Views: 7668
Reputation: 1
Hadoop works with both OpenJDK and the Oracle JDK, but you are using the Oracle JDK. I had the same problem, so I did the following:
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export PATH=${JAVA_HOME}/bin:${PATH}
export HADOOP_CLASSPATH=${JAVA_HOME}/lib/tools.jar
After running these commands in the terminal, you will be able to compile the Java file. Hadoop cannot find tools.jar, the JDK jar that contains com.sun.tools.javac.Main; that is why you are getting that error.
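If you want to verify this, you can check that tools.jar actually exists at the exported path before retrying the compile (a quick sanity check; adjust the JVM directory to whatever your machine uses):
ls ${JAVA_HOME}/lib/tools.jar   # com.sun.tools.javac.Main is loaded from this jar
echo ${HADOOP_CLASSPATH}        # should now include tools.jar
hadoop com.sun.tools.javac.Main WordCount.java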
Upvotes: 0
Reputation:
The Apache Hadoop tutorial assumes that the environment variables are set as follows:
export JAVA_HOME=/usr/java/default
export PATH=$JAVA_HOME/bin:$PATH
export HADOOP_CLASSPATH=$JAVA_HOME/lib/tools.jar
Perhaps the Digital Ocean Hadoop tutorial, which I also followed, ought to recommend adding the latter two variables (PATH and HADOOP_CLASSPATH) to ~/.bashrc, so that it ends up looking like this:
#HADOOP VARIABLES START
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export PATH=$JAVA_HOME/bin:$PATH
export HADOOP_CLASSPATH=$JAVA_HOME/lib/tools.jar
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
#HADOOP VARIABLES END
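After editing, remember to reload the file so the current shell picks up the new variables (otherwise the old values stay in effect):
source ~/.bashrc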
It worked for my installation. See the newly compiled class files listed in the output:
Before:
ubuntu@mail:/usr/local/hadoop$ ls
bin include lib LICENSE.txt NOTICE.txt README.txt share WordCount.java
etc input libexec logs output sbin WordCount_classes
After:
ubuntu@mail:/usr/local/hadoop$ bin/hadoop com.sun.tools.javac.Main WordCount.java
ubuntu@mail:/usr/local/hadoop$ ls
bin input LICENSE.txt output share WordCount$IntSumReducer.class
etc lib logs README.txt WordCount.class WordCount.java
include libexec NOTICE.txt sbin WordCount_classes WordCount$TokenizerMapper.class
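From here, the usual next step is to package the classes and run the job; a minimal sketch, assuming your input files are already in HDFS (all paths here are illustrative):
jar cf wc.jar WordCount*.class
bin/hadoop jar wc.jar WordCount input output
bin/hadoop fs -cat output/part-r-00000   # inspect the result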
Another helpful resource was this thread:
http://ubuntuforums.org/archive/index.php/t-634996.html
It says to append the following lines to the opened .bashrc file, save it, and close:
export JAVA_HOME="/usr/lib/jvm/java-6-sun-1.6.0.22"
export PATH=$PATH:$JAVA_HOME/bin
and then to issue the following command in the terminal:
source $HOME/.bashrc
Please refer to this blog post for more info: http://sureshatt.blogspot.com/2011/01/easiest-way-to-install-java-in-ubuntu.html
Upvotes: 3
Reputation: 2431
Try setting the HADOOP_CLASSPATH environment variable:
export HADOOP_CLASSPATH=$JAVA_HOME/lib/tools.jar:<path to hadoop libs>
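If you are on Hadoop 2.x, one way to fill in the Hadoop libs part without listing every jar by hand is the hadoop classpath subcommand, which prints the classpath Hadoop itself runs with:
export HADOOP_CLASSPATH=${JAVA_HOME}/lib/tools.jar:$(hadoop classpath)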
Upvotes: 0
Reputation: 4010
Hadoop requires a JDK path for JAVA_HOME. Make sure you have set the JDK path, not the JRE path. It seems you installed Java manually. Check the javac version to ensure the compiler is available:
javac -version
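A quick way to confirm that JAVA_HOME points at a JDK rather than a JRE (using the path from the question) is to check for the compiler binary and tools.jar under it:
ls ${JAVA_HOME}/bin/javac ${JAVA_HOME}/lib/tools.jar   # both exist only in a JDK, not a JRE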
Check this similar answer.
Upvotes: 2
Reputation: 1674
I think Java is not enabled correctly. Go to the hadoop-env.sh file and set the Java path there. Also check the JDK version against the JRE version; both must match.
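A sketch of what that looks like in hadoop-env.sh (found under $HADOOP_INSTALL/etc/hadoop in Hadoop 2.x), assuming the Oracle JDK path from the question:
# in hadoop-env.sh: point JAVA_HOME at the JDK root, not at .../jre
export JAVA_HOME=/usr/lib/jvm/java-7-oracle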
Upvotes: 2
Reputation: 3407
Assuming you are using Eclipse or some other IDE: as I mentioned in this post, create a simple Maven-based WordCount project containing your class, so that all the dependencies are handled for you. Then right-click your project, select the Export option, give it a name such as <hadoop-example>.jar, and click through the wizard to generate a jar file for your WordCount project. You don't need to compile your program explicitly; Eclipse does that for you once the export succeeds.
If you installed Hadoop on the same machine, start all the daemons and check with jps whether they all came up. Otherwise, copy the jar file to the virtual machine where Hadoop is installed. Go to the jar's location and run the following command:
hadoop jar <hadoop-example>.jar <fully qualified main class name> hdfsInputPath hdfsOutputPath
This will run your main class (WordCount in your case). This is the general pattern for running any Hadoop program from the command line: hadoop jar, then the jar file, then the fully qualified main class name, then the arguments.
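For example, if your class lives in a package (the package name here is hypothetical; use whatever package your class declares):
hadoop jar hadoop-example.jar com.example.WordCount /user/ubuntu/input /user/ubuntu/output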
Alternatively, you can make WordCount the application entry point (the Main-Class) while exporting the jar file, so you don't need to give the fully qualified name. The command would then look like this:
hadoop jar <hadoop-example>.jar hdfsInputPath hdfsOutputPath
Please try this and let us know if it helps you.
Update: as mentioned in the comments, no IDE is being used. The paths you set up above are used by MapReduce while running the program, but before you can run anything you have to build a jar, and compiling it requires all the dependency jars. So define a variable such as HADOOP_CLASSPATH_LIBS and assign it all the jars under <installed-hadoop>/share/hadoop/<subfolder>/lib, giving the absolute path of every jar, and export the variable. That lets you compile and then build a jar file; once you have the jar, follow the steps above to run it. A sketch of this is below.
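A minimal sketch, assuming Hadoop lives under /usr/local/hadoop as in the question (HADOOP_CLASSPATH_LIBS is just a scratch variable name suggested above, not something Hadoop itself reads):
# collect every dependency jar into one colon-separated list
HADOOP_CLASSPATH_LIBS=$(find /usr/local/hadoop/share/hadoop -name '*.jar' | tr '\n' ':')
export HADOOP_CLASSPATH_LIBS

# compile against those jars, then package the classes
javac -classpath "${HADOOP_CLASSPATH_LIBS}" WordCount.java
jar cf hadoop-example.jar WordCount*.class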
If you need more help, I can assist. Hope it helps.
Upvotes: 1
Reputation: 4372
Try executing from the Hadoop directory:
cd $YARN_HOME
bin/hadoop jar <absolute path to jar file> WordCount <input path> <output path in HDFS>
Check out the link below:
http://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html#Example:_WordCount_v2.0
Upvotes: 2