RAVITEJA SATYAVADA

Reputation: 2571

Unable to run MR on cluster

I have a MapReduce program that runs successfully in standalone (Eclipse) mode, but when I export the jar and run the same job on the cluster, it fails with a NullPointerException like this:

  13/06/26 05:46:22 ERROR mypackage.HHDriver: Error while configuring run method. 
  java.lang.NullPointerException

I used the following code for the main method:

public static void main(String[] args) throws IOException, InterruptedException, ClassNotFoundException {
    Configuration configuration = new Configuration();
    Tool headOfHouseHold = new HHDriver();

    try {
        ToolRunner.run(configuration,headOfHouseHold,args);
    } catch (Exception exception) {
        exception.printStackTrace();
        LOGGER.error("Error while configuring run method", exception);
        // System.exit(1);
    }
}

run method:

if (args != null && args.length == 8) {
    // Setting the Configurations
    GenericOptionsParser genericOptionsParser=new GenericOptionsParser(args);
    Configuration configuration=genericOptionsParser.getConfiguration();

    //Configuration configuration = new Configuration();

    configuration.set("fs.default.name", args[0]);
    configuration.set("mapred.job.tracker", args[1]);
    configuration.set("deltaFlag",args[2]);                                   
    configuration.set("keyPrefix",args[3]);
    configuration.set("outfileName",args[4]);
    configuration.set("Inpath",args[5]);
    String outputPath=args[6];

    configuration.set("mapred.map.tasks.speculative.execution", "false");
    configuration.set("mapred.reduce.tasks.speculative.execution", "false");

    // To avoid the creation of _LOG and _SUCCESS files
    configuration.set("mapreduce.fileoutputcommitter.marksuccessfuljobs", "false");
    configuration.set("hadoop.job.history.user.location", "none");
    configuration.set(Constants.MAX_NUM_REDUCERS,args[7]);

    // Configuration of the MR-Job
    Job job = new Job(configuration, "HH Job");
    job.setJarByClass(HHDriver.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);

    job.setNumReduceTasks(HouseHoldingHelper.numReducer(configuration));
    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);

    MultipleOutputs.addNamedOutput(job,configuration.get("outfileName"),
                                   TextOutputFormat.class,Text.class,Text.class);

    // Deletion of OutputTemp folder (if exists)
    FileSystem fileSystem = FileSystem.get(configuration);
    Path path = new Path(outputPath);

    if (fileSystem.exists(path)) {
        fileSystem.delete(path, true);
    }

    // Deletion of empty files in the output (if exists)
    FileStatus[] fileStatus = fileSystem.listStatus(new Path(outputPath));
    for (FileStatus file : fileStatus) {
        if (file.getLen() == 0) {
            fileSystem.delete(file.getPath(), true);
        }
    }
    // Setting the Input/Output paths
    FileInputFormat.setInputPaths(job, new Path(configuration.get("Inpath")));
    FileOutputFormat.setOutputPath(job, new Path(outputPath));

    job.setMapperClass(HHMapper.class);
    job.setReducerClass(HHReducer.class);

    return job.waitForCompletion(true) ? 0 : 1;
}

I double-checked the run method parameters; none of them are null, and the job runs fine in standalone mode.

Upvotes: 0

Views: 92

Answers (1)

Jit B

Reputation: 1246

The issue could be that the Hadoop configuration is not being passed to your program properly. Try putting this at the beginning of your driver class:

GenericOptionsParser genericOptionsParser = new GenericOptionsParser(args);
Configuration hadoopConfiguration=genericOptionsParser.getConfiguration();

Then use the hadoopConfiguration object when initializing objects.

e.g.

public int run(String[] args) throws Exception {        
    GenericOptionsParser genericOptionsParser = new GenericOptionsParser(args);
    Configuration hadoopConfiguration=genericOptionsParser.getConfiguration();

    Job job = new Job(hadoopConfiguration);
    //set other stuff
}
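Since the question already launches the job through ToolRunner, a related option worth noting: ToolRunner parses the generic options (`-fs`, `-jt`, `-D key=value`, ...) itself and hands the resulting Configuration to the Tool, so the run method can simply call getConf() (inherited from Configured) instead of constructing its own GenericOptionsParser. A minimal sketch of that pattern, assuming HHDriver extends Configured and implements Tool (job details elided):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class HHDriver extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        // ToolRunner has already stripped the generic options from args
        // and folded them into the Configuration returned by getConf().
        Configuration conf = getConf();

        Job job = new Job(conf, "HH Job");
        job.setJarByClass(HHDriver.class);
        // ... set mapper, reducer, input/output formats and paths here ...

        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(), new HHDriver(), args));
    }
}
```

With this pattern, cluster settings passed on the command line are applied to the Configuration before run is invoked, which avoids a half-initialized Configuration when the jar is run on the cluster.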

Upvotes: 1
