Dhoha

Reputation: 369

How to build a job jar for a Hadoop MapReduce job on AWS

I'm trying to run a MapReduce code example on AWS. This is the link to the code sample: https://github.com/ScaleUnlimited/wikipedia-ngrams

However, I'm pretty new to these things. The README says I should build a job jar from the code sample, but I still don't understand how to build one.

I'm also following these videos, which explain how to run a job in EMR: http://www.youtube.com/watch?v=cAZur5maWZE&list=PL080E1DEBCE5388F3

But they don't explain either how to get this important jar file needed to start the job.

Any help?

Upvotes: 1

Views: 1362

Answers (2)

user1914527

Reputation:

You can create the Java files in Eclipse, add Hadoop to the build path, then export the project as a jar. See "6.1 Creating the Jar file" in this tutorial for details: Introduction to Amazon Web Services and MapReduce Jobs

There are two ways to launch the job flow, through the console or the CLI; see sections 6.2 and 6.3 in the same tutorial.
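For the CLI route, a rough sketch using the modern `aws emr` commands looks like the following. The bucket name, jar name, and input/output paths are placeholders you would replace with your own:

```shell
# Upload the job jar to S3 (bucket and paths are placeholders)
aws s3 cp wordcount.jar s3://my-bucket/jars/wordcount.jar

# Launch a cluster that runs the jar as a custom step and
# terminates itself when the step finishes
aws emr create-cluster \
    --name "wordcount-job" \
    --release-label emr-5.36.0 \
    --instance-type m5.xlarge \
    --instance-count 3 \
    --use-default-roles \
    --auto-terminate \
    --steps Type=CUSTOM_JAR,Name=wordcount,Jar=s3://my-bucket/jars/wordcount.jar,Args=[s3://my-bucket/input,s3://my-bucket/output]
```

The `--auto-terminate` flag keeps you from paying for an idle cluster after the step completes.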

Upvotes: 0

Nikita Matyukov

Reputation: 184

The same as for a normal Java program (http://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html):

$ javac -classpath ${HADOOP_HOME}/hadoop-${HADOOP_VERSION}-core.jar -d wordcount_classes WordCount.java 
$ jar -cvf /usr/joe/wordcount.jar -C wordcount_classes/ .

or if it is a maven project:

$ mvn clean package

or, specifically for https://github.com/ScaleUnlimited/wikipedia-ngrams (see its README):

$ ant clean job
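Once built, you can sanity-check the jar locally with `hadoop jar` before uploading it to EMR. A sketch for the WordCount example from the tutorial above (the class name and paths come from that tutorial; adjust them for your own job):

```shell
# Run the built jar against a local/HDFS input directory;
# org.myorg.WordCount is the main class from the Hadoop tutorial
hadoop jar /usr/joe/wordcount.jar org.myorg.WordCount \
    /usr/joe/wordcount/input /usr/joe/wordcount/output
```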

Upvotes: 2
