David Parks

Reputation: 32061

Hadoop: Where does the job first execute before the map tasks?

This is a typical main method of a Hadoop Job:

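import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyHadoopJobDriver extends Configured implements Tool {

  public static void main(String[] args) throws Exception {
    // ToolRunner parses the generic Hadoop options, then calls run() on the driver.
    int exitCode = ToolRunner.run(new MyHadoopJobDriver(), args);
    System.exit(exitCode);
  }
  ...

}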

When I run this job with hadoop MyHadoopJobDriver, is the code above executing in its own JVM on the task tracker? And then, once the job is scheduled, are the map tasks distributed to the task trackers?

Upvotes: 0

Views: 54

Answers (1)

Chris Gerken

Reputation: 16392

Yes, usually. Note that if you run that class in Eclipse with "Debug As -> Java Application", you can use the debugger for testing, setting breakpoints, etc. Note also that even if you run the driver class and the mapper/reducer in Eclipse, you still need Hadoop running on your machine in support of HDFS.
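If you want to check for yourself which machine and JVM the driver's main method runs in, one option is to log the host name and JVM name at startup. The sketch below is only an illustration: DriverLocationProbe is a hypothetical helper class, not part of Hadoop, and it uses nothing but standard JDK calls. The format of the JVM name (often pid@host) is JVM-specific, so treat it as an identifier rather than something to parse.

```java
import java.lang.management.ManagementFactory;
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper (not a Hadoop class): reports which host and JVM
// the calling code is running in.
public class DriverLocationProbe {

  // Builds a string of the form "host=<name> jvm=<jvm name>".
  // The JVM name is implementation-specific (commonly "pid@host").
  public static String describe() {
    String host;
    try {
      host = InetAddress.getLocalHost().getHostName();
    } catch (UnknownHostException e) {
      // Fall back to a placeholder if the local host name cannot be resolved.
      host = "unknown";
    }
    String jvm = ManagementFactory.getRuntimeMXBean().getName();
    return "host=" + host + " jvm=" + jvm;
  }

  public static void main(String[] args) {
    System.out.println("main() running in: " + describe());
  }
}
```

Assuming you add the class to your job jar, calling DriverLocationProbe.describe() from the driver's main and again from a mapper (for example in its setup method) and comparing the two log lines should show whether they ran on the same host and in the same JVM.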

Upvotes: 1
