Afaq

Reputation: 1155

Force java jar to not use classpath packages on EMR

I am trying to run a fat jar through spark-submit on EMR and am running into a dependency problem. The project depends on the Google AdWords library, which I have included in build.sbt. That library internally depends on commons-configuration version 1.10, but when I run the jar on EMR through spark-submit (which runs via the YARN scheduler), version 1.6 of commons-configuration is used instead, because it is part of the CLASSPATH on the EMR cluster. I get the following error:

java.lang.NoSuchMethodError: org.apache.commons.configuration.MapConfiguration

I have tried passing the dependency jar explicitly using the --jars option of spark-submit:

spark-submit --name my-awesome-spark-job --deploy-mode cluster --class package.path.to.my.Main --jars s3://jar-bucket/jars/commons-configuration-1.10.jar s3://code-bucket/jars/spark-code.jar

Doing this still gives me the same error, as the older version of the package from the CLASSPATH is used no matter what. I would like to force my jar to use the dependencies bundled inside the fat jar for certain libraries, e.g. the Google AdWords library here. Thanks.

Upvotes: 0

Views: 1085

Answers (1)

Pietrotull

Reputation: 497

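You could try shading the dependencies that you use and that have an older version available on the cluster. Shading renames those packages inside your fat jar, so they no longer clash with the copies on the cluster's CLASSPATH.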

What do you use to build the jar? I've used this strategy with sbt: https://github.com/sbt/sbt-assembly#shading
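As a minimal sketch of the sbt-assembly approach, assuming the sbt-assembly plugin is already enabled in project/plugins.sbt and that commons-configuration 1.10 is actually bundled into the fat jar (not marked as provided), a shade rule in build.sbt could look something like this. The target prefix shaded.commons.configuration is an arbitrary name chosen for illustration.

```scala
// build.sbt (sketch; assumes the sbt-assembly plugin is enabled)
// Rename the commons-configuration classes bundled in the fat jar and
// rewrite references to them in all bundled dependencies (.inAll), so the
// Google AdWords library resolves the bundled 1.10 copy rather than the
// 1.6 copy found on the cluster CLASSPATH.
assembly / assemblyShadeRules := Seq(
  ShadeRule
    .rename("org.apache.commons.configuration.**" -> "shaded.commons.configuration.@1")
    .inAll
)
```

Older sbt versions use the form `assemblyShadeRules in assembly := ...` instead of the slash syntax. After rebuilding with `sbt assembly`, listing the jar contents should show the classes under the renamed package.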

But there is also a shade plugin for Maven: https://maven.apache.org/plugins/maven-shade-plugin/

Upvotes: 1
