quiet_penguin
quiet_penguin

Reputation: 840

Embedding Pig into Python

I am trying to embed a pig script in Python and am encountering an exception and can't seem to find what the problem is. I have a Python script with pig script embedded in it and have Apache PIG 0.10 installed. I can run pig scripts from the shell and it works ok. when I run the python script with pig embedded from shell using command

pig -x mapreduce pythonscript.py it gives me the error

Error before Pig is launched ---------------------------- ERROR 2998: Unhandled internal error. org/python/util/PythonInterpreter

java.lang.NoClassDefFoundError: org/python/util/PythonInterpreter at org.apache.pig.scripting.jython.JythonScriptEngine.main(JythonScriptEngine.java:338)

I have tried adding Jython jar to the $PIG_CLASSPATH environment variable at shell before running pig command. It does not help.

I see that others are also encountering this problem but, has anyone found a solution? Any pointers?

Upvotes: 0

Views: 1762

Answers (1)

quiet_penguin
quiet_penguin

Reputation: 840

Ok. Have found the solution. If you are also seeing this error then I hope this helps.

1) Downloaded the Jython installer jar. 2) ran it with java -jar 3) Specify a location for the installation 4) Added the Jython executable shell script to my PATH environment variable. 5) Copied the jython jar from installation folder to HADOOP_HOME/lib folder. ie. lib folder under hadoop.

Mostly the step 5 is the deal maker. But these are the steps I followed. It seems that copying/setting the Jython jar to PIG does not seem to help. I am running Hadoop in pseudo cluster mode with Pig on top of it. And Pig seems to take the HADOOP based jars rather than its own lib!!

After this it runs like a charm.

Upvotes: 1

Related Questions