Lubor

Reputation: 999

NoClassDefFoundError: org/apache/spark/sql/hive/HiveContext

I am trying to launch Spark jobs from Oozie. The job runs successfully without Oozie when I submit it directly with spark-submit:

spark-submit --class xxx --master yarn-cluster --files xxx/hive-site.xml --jars xxx/datanucleus-api-jdo-3.2.6.jar,xxx/datanucleus-rdbms-3.2.9.jar,xxx/datanucleus-core-3.2.10.jar xxx.jar

I have included the three external jars and hive-site.xml in workflow.xml, but when I launch the job through Oozie it always fails with the following error:

Launcher exception: org/apache/spark/sql/hive/HiveContext
java.lang.NoClassDefFoundError: org/apache/spark/sql/hive/HiveContext
    at xxx$.main(xxx.scala:20)
    at xxx.main(xxx.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:328)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
    at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:104)
    at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:95)
    at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:47)
    at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:38)

Line 20 of my Scala code is:

val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)

Does anyone know what causes this error? I have been stuck on it for several days.

Thank you!

Upvotes: 1

Views: 2377

Answers (1)

Lubor

Reputation: 999

Coming back to answer my own question. The problem was solved by updating the Oozie shared lib (sharelib). The jars in the shared lib were incomplete for my job: org.apache.spark.sql.hive.HiveContext is packaged in the spark-hive jar, which the shared lib did not contain, so the class could not be found at runtime. I first added the missing jars, such as spark-hive and spark-mllib. The jars that were already in the Oozie shared lib were also too old, so I updated them as well to avoid potential errors.
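For anyone hitting the same thing, here is a rough sketch of how the check and the update could look. It is not the exact procedure I ran. The helper name missing_jars, the staging directory, the HDFS path, and the Oozie URL are all placeholders made up for illustration, and the commented-out commands assume an Oozie version whose admin CLI supports -sharelibupdate and -shareliblist.

```shell
#!/bin/sh
# Hypothetical helper: print every required jar prefix that has no matching
# .jar file in the given directory.
# Usage: missing_jars DIR PREFIX...
missing_jars() {
    dir=$1
    shift
    for prefix in "$@"; do
        # ls fails when the glob matches nothing, which marks the jar as missing
        if ! ls "$dir"/"$prefix"*.jar >/dev/null 2>&1; then
            echo "$prefix"
        fi
    done
}

# Example check against a local staging copy of the sharelib's spark directory
# (the directory name is a placeholder):
# missing_jars ./sharelib-staging/spark spark-hive spark-mllib

# Once the staging copy is complete, upload the jars and tell Oozie to reload
# the shared lib (path and host are placeholders for your cluster):
# hdfs dfs -put ./sharelib-staging/spark/*.jar /user/oozie/share/lib/lib_<timestamp>/spark/
# oozie admin -oozie http://<oozie-host>:11000/oozie -sharelibupdate
# oozie admin -oozie http://<oozie-host>:11000/oozie -shareliblist spark
```

The last command lists the jars Oozie sees for the spark sharelib, which is a quick way to confirm that spark-hive is actually picked up before rerunning the workflow.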

Upvotes: 1
