usb

Reputation: 289

MultipleInputs not working - Hadoop 2.5.0

I'm trying to write a program with two mappers that run simultaneously and one reducer; each mapper has a different input file. Basically, I'm trying to do a reduce-side join. But I get an error when I declare my job the following way:

public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length != 3) {
        System.err.println("Usage: MovieRatings <in1> <in2> <out>");
        System.exit(2);
    }

    Job job = new Job(conf, "movieratings");
    job.setJarByClass(MovieRatings.class);
    job.setMapperClass(MovieIDJoinMapper.class);
    job.setMapperClass(MovieNameJoinMapper.class);
    MultipleInputs.addInputPath(job, new Path("/temp2"), TextInputFormat.class, MovieIDJoinMapper.class);
    MultipleInputs.addInputPath(job, new Path(otherArgs[1]), TextInputFormat.class, MovieNameJoinMapper.class);
    job.setReducerClass(ReduceSideJoin.class);
    job.setNumReduceTasks(1);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[2]));

    System.exit(job.waitForCompletion(true) ? 0 : 1);

}

The error I can't get rid of is:

The method addInputPath(JobConf, Path, Class<? extends InputFormat>, Class<? extends Mapper>) in the type MultipleInputs is not applicable for the arguments (Job, Path, Class<TextInputFormat>, Class<MovieRatings.MovieIDJoinMapper>) MovieRatings.java   /homework2/src

Now I get that it should work if I do:

JobConf job = new JobConf();

But that doesn't work either. I am using Hadoop 2.5.0. I know this is probably a mismatch between the Hadoop version and the API, but I've tried different approaches and nothing seems to work. Can someone help me please? Thanks!

Upvotes: 1

Views: 1335

Answers (2)

B K

Reputation: 743

I got the same error too. The problem here is most likely that you have mixed the old mapred and the new mapreduce libraries in the same program.

Replace

import org.apache.hadoop.mapred.TextInputFormat;

with

import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
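
For reference, a consistent set of new-API driver imports might look like the following. This is only a sketch based on the classes used in the question; the exact list depends on the rest of your code:

    // Everything input/output-related comes from the org.apache.hadoop.mapreduce tree;
    // mixing these with org.apache.hadoop.mapred types is what triggers the
    // "not applicable for the arguments" compile error from the question.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.util.GenericOptionsParser;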

Upvotes: 0

blackSmith

Reputation: 3154

This is an API mismatch issue. You are using the newer types, but you have imported the old org.apache.hadoop.mapred.lib.MultipleInputs class. Change it to the following, and the error should be gone:

 import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
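
With that import, the new-API MultipleInputs accepts a Job as its first argument, so the driver from the question compiles. A trimmed sketch of the relevant part, reusing the class names and the hardcoded "/temp2" path from the question (imports as above; the standalone setMapperClass() calls are dropped, because MultipleInputs already binds a mapper to each input path):

    // Job-based driver using the new (mapreduce) API.
    Job job = new Job(conf, "movieratings");
    job.setJarByClass(MovieRatings.class);
    // No setMapperClass() here: each addInputPath() call below
    // pairs an input path with its own mapper class.
    MultipleInputs.addInputPath(job, new Path("/temp2"),
            TextInputFormat.class, MovieIDJoinMapper.class);
    MultipleInputs.addInputPath(job, new Path(otherArgs[1]),
            TextInputFormat.class, MovieNameJoinMapper.class);
    job.setReducerClass(ReduceSideJoin.class);
    job.setNumReduceTasks(1);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[2]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);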

Upvotes: 1
