samuel281

Reputation: 75

How can I add jars to the classpath and have the change take effect without restarting the Hadoop cluster?

I wrote some MapReduce jobs that reference a few external jars, so I added those jars to the CLASSPATH of the "running" cluster in order to run the jobs.

When I tried to run them, I got ClassNotFoundExceptions. I Googled for ways to fix this and found that I needed to restart the cluster to apply the changed CLASSPATH, and that actually worked.

Oh, yuck! Do I really need to restart the cluster every time I add new jars to the CLASSPATH? That doesn't make sense to me.

Does anyone know how to apply the changes without restarting the cluster?


I think I should add some details to clarify what I'm asking.

I wrote a custom HBase filter class and packed it in a jar. I also wrote a MapReduce job that uses the custom filter class and packed it in another jar. Because the filter class jar wasn't on the classpath of my "running" cluster, I added it there. But I couldn't get the job to run until I restarted the cluster.

Of course, I know I could have packed the filter class and the job together in a single jar, but that's not what I'm asking. I'm curious whether I'll have to restart the cluster again every time I need to add new external jars.

Upvotes: 1

Views: 2792

Answers (2)

Praveen Sripati

Reputation: 33495

Check the Cloudera article on including third-party libraries required for a job. Options (1) and (2) don't require the cluster to be restarted.
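For instance, option (1) passes the extra jars per job with the -libjars generic option, which works when the driver implements Tool so that ToolRunner/GenericOptionsParser parse the option. Here is a minimal sketch, not code from the question; MyJobDriver and the mapper/reducer wiring are placeholders:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    // Minimal driver sketch; MyJobDriver is a placeholder name.
    public class MyJobDriver extends Configured implements Tool {

        @Override
        public int run(String[] args) throws Exception {
            // getConf() already reflects the generic options (including
            // -libjars) parsed by ToolRunner/GenericOptionsParser.
            Job job = Job.getInstance(getConf(), "my-job");
            job.setJarByClass(MyJobDriver.class);
            // ... set mapper, reducer, input and output paths here ...
            return job.waitForCompletion(true) ? 0 : 1;
        }

        public static void main(String[] args) throws Exception {
            System.exit(ToolRunner.run(new Configuration(), new MyJobDriver(), args));
        }
    }

You would then launch the job with the extra jar listed after -libjars (the paths here are just examples):

    hadoop jar my-job.jar MyJobDriver -libjars /path/to/custom-filter.jar in out

The listed jars are shipped to the cluster per job via the distributed cache, so nothing on the cluster itself has to change or be restarted.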

Upvotes: 3

r0ast3d

Reputation: 2637

You could build a system that dynamically resolves class names to an interface type to process your data.
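To make that concrete, here is a minimal, self-contained sketch of the idea; RecordProcessor and UpperCaseProcessor are hypothetical names, and in a real job the class name would come from the job configuration rather than being hard-coded:

    // A sketch of resolving a class name at runtime to a known interface.
    public class DynamicLoadingDemo {

        // The interface your data-processing plugins implement.
        public interface RecordProcessor {
            String process(String record);
        }

        // One hypothetical implementation, shipped in an external jar.
        public static class UpperCaseProcessor implements RecordProcessor {
            @Override
            public String process(String record) {
                return record.toUpperCase();
            }
        }

        public static void main(String[] args) throws Exception {
            // In a real job this would come from configuration, e.g.
            // conf.get("my.processor.class"); hard-coded here for the demo.
            String className = DynamicLoadingDemo.class.getName() + "$UpperCaseProcessor";

            // Resolve the name against the current classloader and verify it
            // really implements the expected interface before instantiating.
            Class<? extends RecordProcessor> cls =
                    Class.forName(className).asSubclass(RecordProcessor.class);
            RecordProcessor processor = cls.getDeclaredConstructor().newInstance();

            System.out.println(processor.process("some record"));
        }
    }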

Just my 2 cents.

Upvotes: -1
