swinefish

Reputation: 561

Oozie spark action error: Main class [org.apache.oozie.action.hadoop.SparkMain], exit code [1]

I am currently setting up an Oozie workflow that uses a Spark action. The Spark code itself works correctly; I have tested it in both local and YARN modes. However, when running it as an Oozie workflow I get the following error:

Main class [org.apache.oozie.action.hadoop.SparkMain], exit code [1]

Having read up on this error, I saw that the most common cause is a problem with the Oozie sharelibs. I have added all Spark jar files to /user/oozie/share/lib/spark on HDFS, restarted Oozie, and run sudo -u oozie oozie admin -oozie http://192.168.26.130:11000/oozie -sharelibupdate to ensure the sharelibs are properly updated. Unfortunately, none of this has stopped the error from occurring.
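For reference, the jars Oozie has actually registered for the spark sharelib can be listed with the shareliblist command (same Oozie URL as above):

sudo -u oozie oozie admin -oozie http://192.168.26.130:11000/oozie -shareliblist spark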

My workflow is as follows:

<workflow-app xmlns='uri:oozie:workflow:0.4' name='SparkBulkLoad'>
    <start to = 'bulk-load-node'/>
    <action name = 'bulk-load-node'>
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <master>yarn</master>
            <mode>client</mode>
            <name>BulkLoader</name>
            <jar>${nameNode}/user/spark-test/BulkLoader.py</jar>
            <spark-opts>
                --num-executors 3 --executor-cores 1 --executor-memory 512m --driver-memory 512m
            </spark-opts>
        </spark>
        <ok to = 'end'/>
        <error to = 'fail'/>
    </action>
    <kill name = 'fail'>
        <message>
            Error occurred while bulk loading files
        </message>
    </kill>
    <end name = 'end'/>
</workflow-app>

and job.properties is as follows:

nameNode=hdfs://192.168.26.130:8020
jobTracker=http://192.168.26.130:8050
queueName=spark
oozie.use.system.libpath=true

oozie.wf.application.path=${nameNode}/user/spark-test/workflow.xml
workflowAppUri=${nameNode}/user/spark-test/BulkLoader.py
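For completeness, I submit the workflow with the standard Oozie CLI, along the lines of:

oozie job -oozie http://192.168.26.130:11000/oozie -config job.properties -run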

Any advice would be greatly appreciated.

Upvotes: 1

Views: 8427

Answers (1)

Marco Massetti

Reputation: 11

I have also specified the libpath:

oozie.libpath=<path>/oozie/share/lib/lib_<timestamp>

It is the value shown in the output of the command you already ran:

sudo -u oozie oozie admin -oozie http://192.168.26.130:11000/oozie -sharelibupdate

Example:

[ShareLib update status]
    sharelibDirOld = hdfs://nameservice1/user/oozie/share/lib/lib_20190328034943
    host = http://vghd08hr.dc-ratingen.de:11000/oozie
    sharelibDirNew = hdfs://nameservice1/user/oozie/share/lib/lib_20190328034943
    status = Successful
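The value of sharelibDirNew is what goes into the property, so for this example output (paths are from my environment; substitute your own):

oozie.libpath=hdfs://nameservice1/user/oozie/share/lib/lib_20190328034943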

Optional: you can also point the launcher's YARN environment at the Spark 2 directory inside the Cloudera parcels folder:

oozie.launcher.yarn.app.mapreduce.am.env=/opt/SP/apps/cloudera/parcels/SPARK2-2.2.0.cloudera4-1.cdh5.13.3.p0.603055/lib/spark2

BUT this might not solve the issue. The other hint I have: the oozie-sharelib-spark.jar from the Spark 1.x sharelib also needs to be present in your spark2 sharelib folder:

/user/oozie/share/lib/lib_20190328034943/spark2/oozie-sharelib-spark.jar

If you copy it into your spark2 folder, it solves the "missing SparkMain" issue but then asks for other dependencies (that might be a problem specific to my environment). I think it is worth a try, so copy the jar as sketched below, run your job, and check the logs.
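A minimal sketch of that copy, assuming the jar already sits in the spark folder of the same lib_<timestamp> directory (adjust the timestamp and paths to your environment), followed by a sharelib refresh:

sudo -u oozie hdfs dfs -cp /user/oozie/share/lib/lib_20190328034943/spark/oozie-sharelib-spark.jar /user/oozie/share/lib/lib_20190328034943/spark2/
sudo -u oozie oozie admin -oozie http://192.168.26.130:11000/oozie -sharelibupdate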

Upvotes: 1
