Reputation: 289
I'm trying to write a program that has two mappers, executed simultaneously, and one reducer. Each mapper has a different input file; essentially, I'm trying to do a reduce-side join. But I get errors when I declare my job the following way:
public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length != 3) {
        System.err.println("Usage: MovieRatings <in1> <in2> <out>");
        System.exit(2);
    }
    Job job = new Job(conf, "movieratings");
    job.setJarByClass(MovieRatings.class);
    job.setMapperClass(MovieIDJoinMapper.class);
    job.setMapperClass(MovieNameJoinMapper.class);
    MultipleInputs.addInputPath(job, new Path("/temp2"), TextInputFormat.class, MovieIDJoinMapper.class);
    MultipleInputs.addInputPath(job, new Path(otherArgs[1]), TextInputFormat.class, MovieNameJoinMapper.class);
    job.setReducerClass(ReduceSideJoin.class);
    job.setNumReduceTasks(1);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[2]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
}
The error I can't get rid of is:
The method addInputPath(JobConf, Path, Class<? extends InputFormat>, Class<? extends Mapper>) in the type MultipleInputs is not applicable for the arguments (Job, Path, Class<TextInputFormat>, Class<MovieRatings.MovieIDJoinMapper>) MovieRatings.java /homework2/src
Now I get that it should work if I do:
JobConf job = new JobConf();
But that doesn't work either. I am using Hadoop 2.5.0. I know this is probably a mismatch between the version and the API, but I've tried different combinations and nothing seems to work. Can someone help me, please? Thanks!
Upvotes: 1
Views: 1335
Reputation: 743
I got the same error too. The problem is most likely that you are mixing classes from the old mapred and the new mapreduce APIs in the same job.
Replace
    import org.apache.hadoop.mapred.TextInputFormat;
with
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
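To be safe, check that every Hadoop import in the driver comes from the new API. A consistent set would look like this (a sketch assuming you use TextInputFormat and FileOutputFormat, as in your driver):

    // all from the new API; nothing here is under org.apache.hadoop.mapred
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;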
Upvotes: 0
Reputation: 3154
This is an API mismatch issue. You are using the newer types, but have somehow imported the old org.apache.hadoop.mapred.lib.MultipleInputs class. You can see this in the error message itself: the expected first argument is JobConf, which is an old-API type, while you are passing a new-API Job. Change the import to the following, and the errors should be gone:
import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
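For completeness, here is a sketch of the full driver against the new API. Two things in it are my own adjustments, not fixes required by your error: I assumed the first input should come from otherArgs[0] rather than the hardcoded /temp2, and I dropped the two setMapperClass calls, since MultipleInputs binds a mapper to each input path and the second call would just overwrite the first. It also assumes your mapper and reducer classes extend the new org.apache.hadoop.mapreduce.Mapper/Reducer types.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.util.GenericOptionsParser;

    public class MovieRatings {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
            if (otherArgs.length != 3) {
                System.err.println("Usage: MovieRatings <in1> <in2> <out>");
                System.exit(2);
            }
            // Job.getInstance is the non-deprecated factory in Hadoop 2.x
            Job job = Job.getInstance(conf, "movieratings");
            job.setJarByClass(MovieRatings.class);
            // one mapper per input path; no setMapperClass needed
            MultipleInputs.addInputPath(job, new Path(otherArgs[0]),
                    TextInputFormat.class, MovieIDJoinMapper.class);
            MultipleInputs.addInputPath(job, new Path(otherArgs[1]),
                    TextInputFormat.class, MovieNameJoinMapper.class);
            job.setReducerClass(ReduceSideJoin.class);
            job.setNumReduceTasks(1);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(Text.class);
            FileOutputFormat.setOutputPath(job, new Path(otherArgs[2]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }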
Upvotes: 1