Reputation: 3585
How can I trigger a jar that runs on Hadoop from a simple Java program, so that it uses HDFS? Currently I am running this command manually:
bin/hadoop jar ~/wordcount_classes/word.jar org.myorg.WordCount ~/hadoop-0.20.203.0/input1 ~/hadoop-0.20.203/output2
The input and output directories I pass are in HDFS, and word.jar is the job jar. I want this to be triggered automatically from my Java project instead.
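In other words, I want the programmatic equivalent of that command. A rough sketch of what I have in mind (the class name TriggerWordCount and the literal paths below are placeholders for my own) would be to call the same entry point that bin/hadoop jar uses, org.apache.hadoop.util.RunJar:
import org.apache.hadoop.util.RunJar;

public class TriggerWordCount {
    public static void main(String[] unused) throws Throwable {
        // Same arguments the shell command passes: local jar path,
        // driver class, then the HDFS input and output directories
        RunJar.main(new String[] {
            "/home/user/wordcount_classes/word.jar",
            "org.myorg.WordCount",
            "input1",
            "output2"
        });
    }
}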
Upvotes: 1
Views: 1086
Reputation: 5402
I'm working on the same problem. I have a program (let's call it Driver) that must implement the following method:
public void runJar(File jar, String mainClass, File inputDir, File outputDir);
To do this, I was calling org.apache.hadoop.util.RunJar.main(String[]), which is what your command line invokes under the hood. This works great, but only as long as Driver itself is run from the command line.
If Driver is running inside a container (like Tomcat or Jetty), you're going to have a problem. You'll get errors like
java.lang.ClassNotFoundException: org.apache.hadoop.fs.Path
This is because of how RunJar messes with classloaders. You need to manually create a classloader, like so:
import java.io.File;
import java.lang.reflect.Method;
import java.net.URL;
import java.net.URLClassLoader;

public void runJar(File jar, String mainClass, File inputDir, File outputDir) throws Exception {
    // Remember the current context classloader so it can be restored afterwards
    final ClassLoader original = Thread.currentThread().getContextClassLoader();
    try {
        // Classloader that can see the job jar, parented to the original loader
        URL[] urls = new URL[] { jar.toURI().toURL() };
        ClassLoader loader = new URLClassLoader(urls, original);
        Thread.currentThread().setContextClassLoader(loader);
        // Load the job's driver class through the new loader and call its main(String[])
        Class<?> driver = Class.forName(mainClass, true, loader);
        Method main = driver.getMethod("main", String[].class);
        String[] args = { inputDir.getPath(), outputDir.getPath() };
        main.invoke(null, new Object[] { args });
    } finally {
        Thread.currentThread().setContextClassLoader(original);
    }
}
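Calling it then mirrors the command line from the question (the jar path here is a hypothetical local path; input1 and output2 stand in for the HDFS directories):
runJar(new File("/home/user/wordcount_classes/word.jar"),
       "org.myorg.WordCount",
       new File("input1"),    // HDFS input dir
       new File("output2"));  // HDFS output dir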
Upvotes: 0
Reputation: 8088
To the best of my understanding, everything you are asking for is done by the main class of your jar: it reads the parameters, creates the job configuration, sets the input and output formats, and finally runs the job.
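For reference, such a main typically looks like the stock WordCount example that ships with Hadoop (a sketch following the standard example, not your actual word.jar code):
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Emits (word, 1) for every token in the input line
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private final Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Sums the counts for each word
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "word count"); // Job.getInstance(conf, ...) on newer Hadoop
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input dir
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output dir
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}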
Upvotes: 1