bunny

Reputation: 2037

Compiling MapReduce jobs from the command line

Hi everyone. I recently installed HDP 2.0 for Windows on my computer, and it passed the smoke-test example provided with HDP 2.0. Now I am trying to compile my own MapReduce program from the command line. I used this command:

javac -classpath c:\hdp\hadoop-2.2.0.2.0.6.0-0009\hadoop-2.2.0.2.0.6.0-0009-core.jar wordcountclass WordCount.java

However, it doesn't work. I found that there is actually no hadoop-2.2.0.2.0.6.0-0009-core.jar under my c:\hdp\hadoop-2.2.0.2.0.6.0-0009 folder. I would like to know how to compile a MapReduce program with HDP 2.0 for Windows; I am not sure which jar files I need to put on the classpath. Could you please help me? Thank you very much!

PS: The error messages are all "cannot find symbol" for Mapper, Reducer, and other MapReduce API classes.

Upvotes: 2

Views: 1204

Answers (1)

Jesse Hernandez

Reputation: 317

I currently use the following jars successfully in Eclipse (see below for the classpath): hadoop-common-*, hadoop-hdfs-*, hadoop-mapreduce-client*, hadoop-mapreduce-client-jobclient*, hive-jdbc*, hive-metastore-*, hive-service*, libfb303*, libthrift*, log4j*, slf4j-api*, slf4j-log4j12*.

Some of these are in different places: some are in the hadoop directory, others under hadoop-hdfs, hadoop-mapreduce, hadoop-yarn, hbase, hcatalog, and hive.

I included all of those locations that had jar files and then trimmed it down from there. On Linux, I export it like this:

export CLASSPATH=.:$CLASSPATH:/usr/lib/hadoop/lib/native/:/usr/lib/hadoop/
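If tracking down each jar by name is tedious, one option (a sketch, not HDP-specific advice) is to put whole lib directories on the classpath with Java's `*` wildcard, which `javac` and `java` have supported since Java 6. The directories below are stand-ins created just for the demonstration; substitute your real Hadoop install locations:

```shell
# Stand-in directories for illustration only; replace with your actual
# Hadoop locations (e.g. /usr/lib/hadoop, /usr/lib/hadoop-mapreduce).
HDP_LIB=/tmp/hdp-demo
mkdir -p "$HDP_LIB/hadoop" "$HDP_LIB/hadoop-mapreduce"
touch "$HDP_LIB/hadoop/hadoop-common-2.2.0.jar" \
      "$HDP_LIB/hadoop-mapreduce/hadoop-mapreduce-client-core-2.2.0.jar"

# A "dir/*" entry makes javac/java pick up every jar in that directory,
# so you don't have to list each one by name. Keep it quoted so the
# shell doesn't expand the * itself.
CLASSPATH=".:$HDP_LIB/hadoop/*:$HDP_LIB/hadoop-mapreduce/*"
echo "$CLASSPATH"
```

With this in place, `javac -cp "$CLASSPATH" WordCount.java` would see every jar in those directories.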

But to answer your question: simply find the libraries above and that should work. Also, if you want to play with an already-built system, try the Cloudera QuickStart VM:

https://www.cloudera.com/content/support/en/downloads/download-components/download-products.html?productID=F6mO278Rvo

It comes with Hadoop already installed and some sample Eclipse code for MapReduce jobs.

There is no difference between Windows and Linux apart from the way you set up the classpath; the libraries are the same.
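On the Windows side, the same idea can be sketched with `set` and `;` as the path separator. The paths below are assumptions about where an HDP 2.0 for Windows install keeps its jars, not verified locations; point them at whichever folders actually contain the hadoop-common and hadoop-mapreduce jars on your machine:

```batch
:: Hypothetical HDP-for-Windows locations; adjust to your install.
set HADOOP_LIB=c:\hdp\hadoop-2.2.0.2.0.6.0-0009\share\hadoop
set CLASSPATH=.;%HADOOP_LIB%\common\*;%HADOOP_LIB%\mapreduce\*

javac -cp "%CLASSPATH%" -d wordcount_classes WordCount.java
```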

EDIT:

By the way, I compile my code like this (create the mapstuff_classes directory first):

javac -cp $CLASSPATH -d mapstuff_classes MapStuff.java

Then create a jar file:

jar -cvf mapstuff.jar -C mapstuff_classes/ .

Then finally run it like this:

hadoop fs -mkdir input                    # creates a directory in hadoop
hadoop fs -copyFromLocal data.csv input   # copies your data into hadoop

hadoop jar mapstuff.jar MapStuff input output   # hadoop creates the output directory itself; took me a while to realize that

Upvotes: 1

Related Questions