Reputation: 1030
My main Spark project has a dependency on other utility jars, so the set of combinations could be like:
1. main_spark-1.0.jar will work with utils_spark-1.0.jar (some jobs use this set)
2. main_spark-2.0.jar will work with utils_spark-2.0.jar (and some of the jobs use this set)
The approach that worked for me to handle this scenario is to pass the utility jar via --jars in spark-opts:

Oozie Spark action job1:

<jar>main_spark-1.0.jar</jar>
<spark-opts>--jars utils_spark-1.0.jar</spark-opts>

Oozie Spark action job2:

<jar>main_spark-2.0.jar</jar>
<spark-opts>--jars utils_spark-2.0.jar</spark-opts>
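For context, a minimal sketch of how one of these actions could sit inside workflow.xml; the action name, class name, HDFS paths, and the ${jobTracker}/${nameNode} properties are placeholders, and the element is spark-opts per the Oozie Spark action schema:

<action name="main-spark-job1">
    <spark xmlns="uri:oozie:spark-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <master>yarn</master>
        <mode>cluster</mode>
        <name>main-spark-job1</name>
        <class>com.example.MainJob</class>
        <jar>${nameNode}/apps/lib/main_spark-1.0.jar</jar>
        <spark-opts>--jars ${nameNode}/apps/lib/utils_spark-1.0.jar</spark-opts>
    </spark>
    <ok to="end"/>
    <error to="fail"/>
</action>

Job2 would look the same except for the 2.0 jar names.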
I tested this configuration in two different actions and it works. The question I have is: if both jobs/actions run in parallel on the same YARN cluster, is there any possibility of a class loader issue (multiple versions of the same jar)? In my understanding both applications will be running in their own Spark context, so it should be OK, but is there any expert advice?
Upvotes: 0
Views: 380
Reputation: 74649
If both jobs/actions run in parallel on the same YARN cluster, is there any possibility of a class loader issue (multiple versions of the same jar)?
No (or at least it is not expected, and if it happened I would consider it a bug).
Submitting a Spark application to a YARN cluster always results in a separate YARN application with its own driver and executors, which together form an environment isolated from other Spark applications, including their classpaths.
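If you want to verify this yourself, here is a minimal sketch (assuming Spark 2.x; the object and app names are placeholders) that each job's main class could run: two parallel actions will print two different YARN application IDs, and each will list only the utils jar that was passed to it via --jars.

import org.apache.spark.sql.SparkSession

object IsolationCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("isolation-check").getOrCreate()
    val sc = spark.sparkContext

    // Each Oozie Spark action becomes its own YARN application,
    // so two parallel actions print two different application IDs.
    println(s"applicationId = ${sc.applicationId}")

    // Only the jars shipped with *this* application via --jars appear here,
    // e.g. utils_spark-1.0.jar for job1 and utils_spark-2.0.jar for job2.
    println(s"jars = ${sc.listJars().mkString(", ")}")

    spark.stop()
  }
}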
Upvotes: 1