Reputation: 75
I wrote some MapReduce jobs that reference a few external jars, so I added those jars to the CLASSPATH of the running cluster in order to run the jobs.
Once I tried to run them, I got ClassNotFoundExceptions. I searched for ways to fix this and found that I needed to restart the cluster for the changed CLASSPATH to take effect, and that actually worked.
Oh, yuck! Do I really need to restart the cluster every time I add new jars to the CLASSPATH? That doesn't seem to make sense.
Does anyone know how to apply such changes without restarting the cluster?
Let me add some detail to clarify what I'm asking.
I wrote a custom HBase filter class and packaged it in a jar.
I also wrote a MapReduce job that uses that custom filter class and packaged it in another jar.
Because the filter class jar wasn't on the classpath of my running cluster, I added it there, but I couldn't run the job successfully until I restarted the cluster.
Of course, I know I could package the filter class and the job together in a single jar, but that's not what I'm asking. I'm curious whether I really have to restart the cluster every time I need to add new external jars.
Upvotes: 1
Views: 2792
Reputation: 33495
Check the Cloudera article on including third-party libraries required for a job. Options (1) and (2) there don't require the cluster to be restarted.
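In case it helps, here is a minimal sketch of the job-level approach of shipping the jar at submission time with -libjars (class and jar names are placeholders, and it assumes a Hadoop 2.x-style driver run through ToolRunner so the generic options get parsed):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// The driver implements Tool, so ToolRunner/GenericOptionsParser picks up
// generic options such as -libjars and ships the listed jars with the job,
// without touching the node-level CLASSPATH.
public class MyJobDriver extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        Job job = Job.getInstance(getConf(), "job-with-external-jars");
        job.setJarByClass(MyJobDriver.class);
        // ... set mapper, reducer, input/output formats and paths here ...
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(), new MyJobDriver(), args));
    }
}
```

Submission would then look something like:

```
hadoop jar my-job.jar MyJobDriver -libjars /path/to/custom-filter.jar <other args>
```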
Upvotes: 3
Reputation: 2637
You could also build a system that dynamically resolves class names to an interface type to process your data.
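A minimal sketch of that idea, with hypothetical names (RecordProcessor is an assumed interface, not anything from the question): callers depend only on the interface, and the concrete class is looked up by name at runtime, e.g. from the job configuration.

```java
// Hypothetical sketch: the concrete implementation is resolved by name at
// runtime, so new processing logic can be added by supplying a class name
// rather than hard-coding the type. The named class still has to be on the
// classpath, for example shipped with the job jar.
public class DynamicProcessorLoader {

    /** The contract every pluggable processor implements. */
    public interface RecordProcessor {
        String process(String record);
    }

    public static RecordProcessor load(String className) throws Exception {
        Class<?> clazz = Class.forName(className);
        return (RecordProcessor) clazz.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        // args[0] is the fully qualified name of a RecordProcessor implementation.
        RecordProcessor processor = load(args[0]);
        System.out.println(processor.process("example record"));
    }
}
```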
Just my 2 cents.
Upvotes: -1