Reputation: 357
I am new to Hadoop. I want to access a command line argument from main function(Java program) inside the map function of the mapper class. Please suggest ways to do this.
Upvotes: 20
Views: 24479
Reputation: 1837
In recent Hadoop versions (e.g. >=0.2 up to 2.4+) you would set this kind of option during job configuration:
JobConf conf = new JobConf(MyJarClass.class);
conf.set("myStringOption", "myStringValue");
conf.setInt("myIntOption", 42);
And retrieve those options in the configure() method of the mapper/reducer classes:
public static class MyMapper extends MapReduceBase implements Mapper<...> {
    Integer myIntegerOption;
    String myStringOption;

    @Override
    public void configure(JobConf job) {
        super.configure(job);
        myIntegerOption = job.getInt("myIntOption", -1);
        // nb: the last arg is the default value if the option is not set
        myStringOption = job.get("myStringOption", "notSet");
    }

    @Override
    public void map(... key, ... value,
                    OutputCollector<...> output, Reporter reporter) throws IOException {
        // here you can use the options in your processing
        processRecord(key, value, myIntegerOption, myStringOption);
    }
}
Note that configure() will be called once before any records are passed to map() or reduce().
Upvotes: 5
Reputation: 33495
Hadoop 0.20 introduced the new MR API. There is not much functional difference between the new API (the o.a.h.mapreduce package) and the old one (o.a.h.mapred), except that with the new API the configuration data can be pulled inside the mappers and reducers via the Context. What Arnon mentioned applies to the old API.
Check this article for passing the parameters using the new and old API.
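With the new API the pattern looks like the following sketch. This is an assumption-laden illustration, not code from the linked article: the class, option names, and key/value types are all made up for the example; only Job, Configuration, Mapper, and context.getConfiguration() are real Hadoop (new API) classes and methods.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;

public class NewApiExample {

    // Driver side: store the options on the Configuration before submitting the job
    public static Job createJob(String myStringValue) throws IOException {
        Configuration conf = new Configuration();
        conf.set("myStringOption", myStringValue);
        conf.setInt("myIntOption", 42);
        return Job.getInstance(conf, "example job");
    }

    // Mapper side: read the options back from the Context (key/value types are illustrative)
    public static class MyMapper
            extends Mapper<LongWritable, Text, Text, LongWritable> {

        private int myIntOption;
        private String myStringOption;

        @Override
        protected void setup(Context context) {
            Configuration conf = context.getConfiguration();
            myIntOption = conf.getInt("myIntOption", -1);       // second arg is the default
            myStringOption = conf.get("myStringOption", "notSet");
        }

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // use myIntOption / myStringOption in your processing here
        }
    }
}
```

The new-API equivalent of configure() is setup(Context), which is likewise called once per task before any records reach map().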
Upvotes: 19
Reputation: 25909
You can pass parameters by hanging them on the Configuration:
JobConf job = new JobConf(new Configuration(), TheJob.class);
job.setLong("Param Name", longValue);
The Configuration class has several set methods (setLong, setInt, set for Strings, etc.), so you can pass parameters of several types. In the map task you can get the configuration from the Context (context.getConfiguration()).
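On the mapper side, retrieving that value could look like this sketch, assuming the new o.a.h.mapreduce API; the class name and key/value types are illustrative, and "Param Name" matches the key set above:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class ParamMapper extends Mapper<LongWritable, Text, Text, LongWritable> {

    private long paramValue;

    @Override
    protected void setup(Context context) {
        // getLong mirrors setLong; the second argument is the default
        // returned when the key is absent from the Configuration
        Configuration conf = context.getConfiguration();
        paramValue = conf.getLong("Param Name", 0L);
    }
}
```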
Upvotes: 15