Vardan Gupta

Reputation: 3585

Trigger a Hadoop command from Java code

How can I trigger a jar that runs on Hadoop (and uses HDFS) from a plain Java program? At the moment I run this command manually: bin/hadoop jar ~/wordcount_classes/word.jar org.myorg.WordCount ~/hadoop-0.20.203.0/input1 ~/hadoop-0.20.203/output2, where the last two arguments are the input and output directories in HDFS and word.jar is the job jar. I want the same job to be triggered automatically from a Java project.

Upvotes: 1

Views: 1086

Answers (2)

Steve Armstrong

Reputation: 5402

I'm working on the same problem. I have a program (let's call it Driver) that must implement the following method:

public void runJar(File jar, String mainClass, File inputDir, File outputDir);

To do this, I was calling org.apache.hadoop.util.RunJar.main(String[]), which is what your command line is calling. This works well, but only if you're running Driver from the command line.
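As a rough sketch of what that call might look like, the arguments handed to RunJar are the same ones that follow "bin/hadoop jar" on the command line: the jar path, the main class, then the job's own arguments. RunJarArgs and build below are hypothetical names introduced for illustration, not part of Hadoop, and the actual RunJar call is left as a comment because it needs the Hadoop jars on the classpath.

```java
import java.io.File;
import java.util.Arrays;

// Hypothetical helper (not part of Hadoop): assembles the argument array
// that mirrors "bin/hadoop jar <jar> <mainClass> <input> <output>".
class RunJarArgs {
    static String[] build(File jar, String mainClass, File inputDir, File outputDir) {
        return new String[] {
            jar.getPath(),       // the job jar
            mainClass,           // e.g. org.myorg.WordCount
            inputDir.getPath(),  // first argument passed on to the job's main
            outputDir.getPath()  // second argument passed on to the job's main
        };
    }

    public static void main(String[] unused) {
        String[] args = build(
            new File("word.jar"), "org.myorg.WordCount",
            new File("input1"), new File("output2"));
        System.out.println(Arrays.toString(args));
        // Assumption: with the Hadoop jars on the classpath, the call would be
        // org.apache.hadoop.util.RunJar.main(args);
    }
}
```

The paths here are placeholders; in practice they would be whatever you currently type after "bin/hadoop jar".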

If Driver is running inside a container (like Tomcat or Jetty), you're going to have a problem. You'll get errors like

java.lang.ClassNotFoundException: org.apache.hadoop.fs.Path

This is because of how RunJar messes with classloaders. You need to manually create a classloader like so:

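// Needs: java.lang.reflect.Method, java.net.URL, java.net.URLClassLoader
// Remember the container's context classloader so it can be restored afterwards.
final ClassLoader original = Thread.currentThread().getContextClassLoader();
try {
  // Create a loader that can see the job jar and delegates everything else to the original.
  URL[] urls = new URL[] { jar.toURI().toURL() };
  ClassLoader loader = new URLClassLoader(urls, original);
  Thread.currentThread().setContextClassLoader(loader);

  // Load the job's main class (the mainClass parameter of runJar) and call main(String[]).
  Class<?> driver = Class.forName(mainClass, true, loader);
  Method main = driver.getMethod("main", String[].class);
  // Assumption: the job's main expects the input and output paths as its arguments.
  String[] args = { inputDir.getPath(), outputDir.getPath() };
  main.invoke(null, new Object[] { args });
} finally {
  // Always put the original loader back, even if the job throws.
  Thread.currentThread().setContextClassLoader(original);
}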
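To try the pattern without Hadoop or a container, here is a self-contained sketch using only the JDK. ContextLoaderDemo, FakeJob and runMain are hypothetical names made up for this illustration; FakeJob stands in for the job's main class, and the parent loader is passed in explicitly so the demo does not depend on how the surrounding program was started (in the snippet above, the parent would be the original context classloader).

```java
import java.lang.reflect.Method;
import java.net.URL;
import java.net.URLClassLoader;

class ContextLoaderDemo {
    // Stand-in for a job's main class: records what it saw when invoked.
    public static class FakeJob {
        public static String[] seenArgs;
        public static ClassLoader seenLoader;

        public static void main(String[] args) {
            seenArgs = args;
            seenLoader = Thread.currentThread().getContextClassLoader();
        }
    }

    // Swaps the context classloader, invokes className.main(args) reflectively,
    // and restores the previous context classloader no matter what happens.
    static void runMain(URL[] urls, ClassLoader parent, String className, String[] args)
            throws Exception {
        final ClassLoader original = Thread.currentThread().getContextClassLoader();
        try (URLClassLoader loader = new URLClassLoader(urls, parent)) {
            Thread.currentThread().setContextClassLoader(loader);
            Class<?> driver = Class.forName(className, true, loader);
            Method main = driver.getMethod("main", String[].class);
            // Wrap args so the String[] is passed as one argument, not spread as varargs.
            main.invoke(null, new Object[] { args });
        } finally {
            Thread.currentThread().setContextClassLoader(original);
        }
    }

    public static void main(String[] unused) throws Exception {
        // No extra jars are needed here, so the URL array is empty.
        runMain(new URL[0], ContextLoaderDemo.class.getClassLoader(),
                FakeJob.class.getName(), new String[] { "input1", "output2" });
        System.out.println("args seen by job: " + FakeJob.seenArgs.length);
        System.out.println("loader seen by job: " + FakeJob.seenLoader.getClass().getName());
    }
}
```

The try/finally is the important part: if the context classloader is not restored, later requests handled by the same container thread may end up resolving classes through the job's loader.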

Upvotes: 0

David Gruzman

Reputation: 8088

To the best of my understanding, everything you are asking for is done by the main method of your jar. It reads the parameters, creates the job configuration, sets the input and output formats, and finally runs the job.

Upvotes: 1
