Betta

Reputation: 416

Hadoop MapReduce: only one mapper is executed

I am running a MapReduce job. Whatever the size of the input file (70 MB, 200 MB, 2.5 GB), only one mapper runs. The block size is 128 MB.

Could anyone help me find out what the reason could be?

Notes

  1. The data file is not a zip/gzip file; it is a *.dat file.
  2. This is not a production environment. Is it possible that my user is a low-priority user? See item 11 in https://cloudcelebrity.wordpress.com/2013/08/14/12-key-steps-to-keep-your-hadoop-cluster-running-strong-and-performing-optimum/

My code for submitting the job is as below:

    String configPath = arg[0];
    String feedString = FileUtils.readFileToString(new File(configPath), StandardCharsets.UTF_8.name());
    getConf().set(Constants.FEED_CONFIG_STRING, feedString);
    getConf().set("mapred.reduce.tasks.speculative.execution", "false");

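    // Job.getInstance replaces the deprecated Job constructor; getConf() carries the settings made above.
    Job job = Job.getInstance(getConf());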
    Feed feed = XMLFeedConfig.getFeed(feedString);
    job.setJarByClass(DataValidationJob.class);
    job.setJobName("Job " + feed.getName());

    ValidatorInputFormat.setInputPaths(job, new Path(feed.getSrc_location()));
    FileOutputFormat.setOutputPath(job, new Path(feed.getDest_location()));

    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(Text.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);

    job.setMapperClass(ValidatorMapper.class);
    job.setReducerClass(ValidatorReducer.class);
    LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class);
    job.setNumReduceTasks(1);

    job.setInputFormatClass(ValidatorInputFormat.class);
    // job.setOutputFormatClass(TextOutputFormat.class);

    return job.waitForCompletion(true) ? 0 : 1;

Upvotes: 0

Views: 935

Answers (1)

Betta

Reputation: 416

My issue has been resolved. We had implemented our own FileInputFormat subclass, in which we had overridden the isSplitable method to make the input non-splittable, as shown below:

    @Override
    protected boolean isSplitable(JobContext context, Path filename) {
        return false;
    }

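By default, FileInputFormat.isSplitable returns true. With the override above, each input file becomes a single split regardless of its size, so a single input file is processed by a single mapper.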
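
To illustrate why the override matters, here is a rough sketch of how the number of splits (and so mappers) per file is estimated. It assumes the logic mirrors FileInputFormat.getSplits: split size = max(minSize, min(maxSize, blockSize)), with 10% slack allowed on the last split. The names SplitEstimate, computeSplitSize and estimateSplits are made up for this example and are not Hadoop APIs; the minSize and maxSize values are assumed defaults.

```java
// Hypothetical stand-alone sketch; it does not depend on Hadoop.
class SplitEstimate {
    // Assumed to match the 10% slack FileInputFormat allows on the last split.
    private static final double SPLIT_SLOP = 1.1;
    private static final long MB = 1024L * 1024L;

    // Split size as FileInputFormat derives it from block size and min/max settings.
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Estimated number of splits for one non-empty file.
    static int estimateSplits(long fileLength, long splitSize, boolean splitable) {
        if (!splitable) {
            return 1; // isSplitable() == false: the whole file is one split
        }
        int splits = 0;
        long remaining = fileLength;
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits++;
            remaining -= splitSize;
        }
        if (remaining != 0) {
            splits++; // the tail, up to 1.1 x splitSize
        }
        return splits;
    }

    public static void main(String[] args) {
        // Assumed defaults: minSize = 1 byte, maxSize = Long.MAX_VALUE, block = 128 MB.
        long splitSize = computeSplitSize(128 * MB, 1L, Long.MAX_VALUE);
        long[] fileSizes = {70 * MB, 200 * MB, 2560 * MB}; // 70 MB, 200 MB, 2.5 GB
        for (long size : fileSizes) {
            System.out.println((size / MB) + " MB: splittable="
                    + estimateSplits(size, splitSize, true)
                    + ", non-splittable="
                    + estimateSplits(size, splitSize, false));
        }
    }
}
```

Under these assumptions, the three file sizes from the question would give 1, 2 and 20 splits when the input is splittable, and always 1 when it is not, which matches the behaviour I was seeing.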

Upvotes: 1
