Sunil Kumar B M

Oozie jobs failing on MapR 6.x

I'm trying to submit a Spark job through Oozie in yarn-client mode. The Spark job runs fine when I launch it outside of Oozie, but when I submit it as an Oozie job, it keeps failing with the error below:

Exception in thread "main" java.lang.IllegalStateException: basedir job.jar/lib does not exist.
    at org.apache.tools.ant.DirectoryScanner.scan(DirectoryScanner.java:871)
    at org.apache.spark.classpath.ClasspathFilter$$anonfun$resolveClasspath$1.apply(ClasspathFilter.scala:47)
    at org.apache.spark.classpath.ClasspathFilter$$anonfun$resolveClasspath$1.apply(ClasspathFilter.scala:44)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
    at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
    at scala.collection.mutable.ArrayOps$ofRef.flatMap(ArrayOps.scala:186)
    at org.apache.spark.classpath.ClasspathFilter$.resolveClasspath(ClasspathFilter.scala:44)
    at org.apache.spark.classpath.ClasspathFilter$.main(ClasspathFilter.scala:31)
    at org.apache.spark.classpath.ClasspathFilter.main(ClasspathFilter.scala)
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream
    at org.apache.spark.deploy.SparkSubmitArguments.handleUnknown(SparkSubmitArguments.scala:465)
    at org.apache.spark.launcher.SparkSubmitOptionParser.parse(SparkSubmitOptionParser.java:178)
    at org.apache.spark.deploy.SparkSubmitArguments.<init>(SparkSubmitArguments.scala:104)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.FSDataInputStream
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 5 more
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.ShellMain], exit code [1]

At first, I thought it was not able to load the HDFS-related dependencies, so I added the Hadoop dependency to my classpath and resubmitted the job. That didn't work.

Later, I built an uber jar of my application and tried running that. Still the same result.

If I run the same job in a MapR 5.x environment, the Oozie job runs successfully without any issue. The same job fails in the MapR 6.x environment.

Has anyone faced the same issue? Any help is appreciated.

Here are some important details:

MapR version: 6.0.1
Spark version: 2.2.1
Oozie version: 4.3.0
Hadoop version: 2.7.0

Answers (1)

Sunil Kumar B M
I was finally able to resolve the issue.

The problem was in mapr-spark.env.sh.

There, MAPR_HADOOP_CLASSPATH was set to `/opt/mapr/spark/spark-2.2.1/bin/mapr-classpath.sh`

I changed it to MAPR_HADOOP_CLASSPATH=`hadoop classpath`. With that, the Hadoop libraries (the HDFS ones in particular) were loaded properly and the Oozie jobs ran successfully.
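
For reference, the one-line change would look roughly like this inside the env file. It is only a sketch of the single assignment involved, with the rest of the file left untouched. The `tr` command in the last comment is a suggestion for inspecting the result and is not part of the fix.

```shell
# Before (the value reported in this answer): classpath produced by MapR's helper script.
# MAPR_HADOOP_CLASSPATH=`/opt/mapr/spark/spark-2.2.1/bin/mapr-classpath.sh`

# After: take the classpath from the hadoop CLI instead.
MAPR_HADOOP_CLASSPATH=`hadoop classpath`

# Optional sanity check from a terminal (prints one classpath entry per line):
#   hadoop classpath | tr ':' '\n'
```

The backticks here are shell command substitution, so the `hadoop` command has to be on the PATH in whatever environment sources this file.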
