nemo

Reputation: 1584

How To Refresh/Clear the DistributedCache When Using Hue + Beeswax To Run Hive Queries That Define Custom UDFs?

I've set up a Hadoop cluster (using the Cloudera distro through Cloudera Manager) and I'm running some Hive queries using the Hue interface, which uses Beeswax underneath.

All my queries run fine and I have even successfully deployed a custom UDF.

But while deploying the UDF, I ran into a very frustrating versioning issue. The initial version of my UDF class used a third-party class that caused a StackOverflowError.

I fixed this error and then verified that the UDF can be deployed and used successfully from the hive command line.

Then, when I went back to Hue and Beeswax, I kept getting the same error. The only way I could fix it was to rename my UDF Java class (from Lower to Lower2).

Now, my question is: what is the proper way to deal with this kind of versioning issue?

From what I understand, when I add jars using the handy form fields on the left, they get added to the distributed cache. So, how do I refresh or clear the distributed cache? (I couldn't get LIST JARS; and similar commands to run from within Hive / Beeswax; they give me a syntax error.)

Upvotes: 7

Views: 3550

Answers (1)

Harsh J

Reputation: 1494

Since the classes are loaded into the Beeswax Server JVM (the same goes for the HiveServer1 and HiveServer2 JVMs), deploying a new version of a jar often requires restarting these services to avoid such class-loading issues.
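
To illustrate the general JVM behaviour behind this, here is a minimal sketch in plain Java. It does not touch Hive or Beeswax, and the class name ClassCacheDemo and the method sameClassReturned are hypothetical names made up for the example. It only shows that, within a running JVM, looking up a class name through the same class loader hands back the class that was already loaded rather than re-reading it from disk. That would be consistent with a long-running server continuing to use the old UDF class until it is restarted or the class is given a new name.

```java
public class ClassCacheDemo {

    // Looks up the same class name twice through the same class loader and
    // reports whether both lookups return the identical Class object.
    static boolean sameClassReturned() {
        try {
            ClassLoader loader = ClassCacheDemo.class.getClassLoader();
            String name = ClassCacheDemo.class.getName();

            // The second lookup does not reload the class bytes; the loader
            // returns the definition it already holds for this name.
            Class<?> first = Class.forName(name, true, loader);
            Class<?> second = Class.forName(name, true, loader);

            return first == second;
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sameClassReturned());
    }
}
```

Because the loader keeps the first definition it saw for a given name, replacing the jar on disk does not by itself change what an already-running JVM uses. This is also why renaming the class (Lower to Lower2) appeared to fix the problem: a new name has no previously loaded definition.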

Upvotes: 2
