Reputation: 75
I wrote some MapReduce jobs that reference a few external jars, so I added those jars to the CLASSPATH of the running cluster in order to run the jobs.
Once I tried to run them, I got ClassNotFoundExceptions. I searched for ways to fix this and found that I needed to restart the cluster for the changed CLASSPATH to take effect, and that actually worked.
Oh, yuck! Do I really need to restart the cluster every time I add new jars to the CLASSPATH? That doesn't seem to make sense.
Does anyone know how to apply such changes without restarting the cluster?
Let me add some detail to clarify what I'm asking.
I wrote a custom HBase filter class and packaged it in a jar.
I also wrote a MapReduce job that uses that custom filter class and packaged it in another jar.
Because the filter class jar wasn't on the classpath of my running cluster, I added it there, but I couldn't run the job successfully until I restarted the cluster.
Of course, I know I could package the filter class and the job together in a single jar, but that's not what I'm asking. I'm curious whether I really have to restart the cluster every time I need to add new external jars.
Upvotes: 1
Views: 2792
Reputation: 33495
Check the Cloudera article on including third-party libraries required for a job. Options (1) and (2) there don't require the cluster to be restarted.
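In case it helps, here is a minimal sketch of the job-level approach of shipping the jar at submission time with -libjars (class and jar names are placeholders, and it assumes a Hadoop 2.x-style driver run through ToolRunner so the generic options get parsed):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// The driver implements Tool, so ToolRunner/GenericOptionsParser picks up
// generic options such as -libjars and ships the listed jars with the job,
// without touching the node-level CLASSPATH.
public class MyJobDriver extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        Job job = Job.getInstance(getConf(), "job-with-external-jars");
        job.setJarByClass(MyJobDriver.class);
        // ... set mapper, reducer, input/output formats and paths here ...
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(), new MyJobDriver(), args));
    }
}
```

Submission would then look something like:

```
hadoop jar my-job.jar MyJobDriver -libjars /path/to/custom-filter.jar <other args>
```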
Upvotes: 3
Reputation: 2637
You could also build a system that dynamically resolves class names to an interface type to process your data.
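A minimal sketch of that idea, with hypothetical names (RecordProcessor is an assumed interface, not anything from the question): callers depend only on the interface, and the concrete class is looked up by name at runtime, e.g. from the job configuration.

```java
// Hypothetical sketch: the concrete implementation is resolved by name at
// runtime, so new processing logic can be added by supplying a class name
// rather than hard-coding the type. The named class still has to be on the
// classpath, for example shipped with the job jar.
public class DynamicProcessorLoader {

    /** The contract every pluggable processor implements. */
    public interface RecordProcessor {
        String process(String record);
    }

    public static RecordProcessor load(String className) throws Exception {
        Class<?> clazz = Class.forName(className);
        return (RecordProcessor) clazz.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        // args[0] is the fully qualified name of a RecordProcessor implementation.
        RecordProcessor processor = load(args[0]);
        System.out.println(processor.process("example record"));
    }
}
```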
Just my 2 cents.
Upvotes: -1