Submitting Hadoop jobs through Hadoop job client on the command line

Question

I have been trying to find information on how to submit Hadoop jobs from the command line.

I am aware of this command: (a) hadoop jar jar-file main-class input output

There is another command that I have not been able to find information about: (b) hadoop job -submit job-file

What is a "job-file", and how do I create one? What is the basic difference between commands (a) and (b)? Which is the better option?

Thanks in advance.

saurabh shashank · Accepted Answer

Here is an example of a job-file for running the wordcount MapReduce job. You can write job-files for your own MapReduce jobs in the same way.

mapred.input.dir=data/file1.txt
mapred.output.dir=output
mapred.job.name=wordcount
mapred.mapper.class=edu.uci.ics.hyracks.examples.wordcount.WordCount$Map
mapred.combiner.class=edu.uci.ics.hyracks.examples.wordcount.WordCount$Reduce
mapred.reducer.class=edu.uci.ics.hyracks.examples.wordcount.WordCount$Reduce
mapred.input.format.class=org.apache.hadoop.mapred.TextInputFormat
mapred.output.format.class=org.apache.hadoop.mapred.TextOutputFormat
mapred.mapoutput.key.class=org.apache.hadoop.io.Text
mapred.mapoutput.value.class=org.apache.hadoop.io.IntWritable
mapred.output.key.class=org.apache.hadoop.io.Text
mapred.output.value.class=org.apache.hadoop.io.IntWritable
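
A side note, in case your client rejects the key=value layout: as far as I know, stock Hadoop reads a job-file as a configuration resource, and those are normally written in the XML <configuration> format. Assuming that is what your version expects, the properties above could be converted with something like the rough sketch below. JobFileToXml and its methods are made-up names for illustration and are not part of Hadoop; only the Java standard library is used.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Hypothetical helper (not part of Hadoop): key=value job-file to configuration-style XML. */
public class JobFileToXml {

    /** Escapes the characters that are special inside XML text. */
    public static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Renders key=value text as a <configuration> document, sorted by property name. */
    public static String toXml(String jobFileText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(jobFileText));
        } catch (IOException e) {
            // Not expected for an in-memory reader; rethrow unchecked.
            throw new UncheckedIOException(e);
        }

        // Sort the names so the output order is stable.
        Map<String, String> sorted = new TreeMap<>();
        for (String name : props.stringPropertyNames()) {
            sorted.put(name, props.getProperty(name));
        }

        StringBuilder xml = new StringBuilder("<?xml version=\"1.0\"?>\n<configuration>\n");
        for (Map.Entry<String, String> e : sorted.entrySet()) {
            xml.append("  <property>\n")
               .append("    <name>").append(escape(e.getKey())).append("</name>\n")
               .append("    <value>").append(escape(e.getValue())).append("</value>\n")
               .append("  </property>\n");
        }
        return xml.append("</configuration>\n").toString();
    }

    public static void main(String[] args) throws IOException {
        // Read the job-file named on the command line, or fall back to a small sample.
        String text = args.length > 0
                ? new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8)
                : "mapred.input.dir=data/file1.txt\nmapred.output.dir=output\nmapred.job.name=wordcount\n";
        System.out.print(toXml(text));
    }
}
```

Run it with the path to the key=value file as the only argument and redirect the output to a file, then check the result against what your Hadoop version accepts before submitting it.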

For me, "hadoop jar" is the better option, because the configuration done in the job-file can easily be done in the program itself. Thanks.
