Reputation: 369
I'm trying to run a MapReduce code example on AWS. Here is the link to the code sample: https://github.com/ScaleUnlimited/wikipedia-ngrams
However, I'm pretty new to these things. The README does say that I should build a job jar from the code sample, but I still don't understand how to build that job jar.
I'm also following these videos, which explain how to run a job in EMR: http://www.youtube.com/watch?v=cAZur5maWZE&list=PL080E1DEBCE5388F3
But they also don't explain how to get this important jar file to start with.
Any help would be appreciated.
Upvotes: 1
Views: 1362
Reputation:
You can create the Java files in Eclipse, add Hadoop to the build path, then export the project as a jar. See "6.1 Creating the Jar file" in this tutorial for details: Introduction to Amazon Web Services and MapReduce Jobs
There are two ways to launch the job flow, through the console or the CLI; see sections 6.2 and 6.3 in the tutorial above. A rough CLI sketch is below.
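For example, with the current aws CLI (the tutorial above uses the older elastic-mapreduce tool, but the idea is the same), a cluster with a custom jar step can be started roughly like this; the bucket name, jar name, instance settings, and step arguments are placeholders you need to replace with your own:
$ aws emr create-cluster \
    --name "wikipedia-ngrams" \
    --release-label emr-5.36.0 \
    --applications Name=Hadoop \
    --use-default-roles \
    --instance-type m4.large \
    --instance-count 3 \
    --auto-terminate \
    --steps Type=CUSTOM_JAR,Name=NgramsStep,Jar=s3://my-bucket/wikipedia-ngrams-job.jar,Args=[s3://my-bucket/input,s3://my-bucket/output]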
Upvotes: 0
Reputation: 184
The same as for a normal Java program (see http://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html):
$ javac -classpath ${HADOOP_HOME}/hadoop-${HADOOP_VERSION}-core.jar -d wordcount_classes WordCount.java
$ jar -cvf /usr/joe/wordcount.jar -C wordcount_classes/ .
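To check that the jar actually works, the same tutorial then runs it on a Hadoop cluster like this (the class name and the input/output paths are the tutorial's examples):
$ hadoop jar /usr/joe/wordcount.jar org.myorg.WordCount /usr/joe/wordcount/input /usr/joe/wordcount/output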
Or, if it is a Maven project:
$ mvn clean package
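With Maven the built jar ends up in the project's target/ directory; the exact file name depends on the artifactId and version in the pom.xml:
$ ls target/*.jar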
Or, specifically for https://github.com/ScaleUnlimited/wikipedia-ngrams (see its README):
$ ant clean job
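With the Ant build the job jar should show up under the project's build directory; the path and name pattern below are a guess, so check the ant output (or build.xml) for where it is actually written:
$ ls build/*job*.jar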
Upvotes: 2