azouritis

Reputation: 85

Bootstrap script for downloading jar dependencies in an EMR Spark cluster

I want to do something that I believe is really simple: run my custom jar on EMR Spark. Right now I run

sbt assembly

which creates a fat jar (80MB-120MB) that is a pain to upload to S3.

What I want is to use

sbt pack

to get all the jars in a folder and upload them to S3 once. Then, every time I want to upload a new build, I would upload only the compiled jar, without the dependencies.

I believe this could be done with a bootstrap.sh that copies all the jars to the cluster, and then passing them with the --jars parameter.

Has anyone done that?

Upvotes: 2

Views: 1432

Answers (2)

Tomer

Reputation: 1108

Here is an example. First, create a bootstrap.sh script:

    sudo wget http://dl.bintray.com/spark-packages/maven/graphframes/graphframes/0.6.0-spark2.3-s_2.11/graphframes-0.6.0-spark2.3-s_2.11.jar -P /usr/lib/spark/jars/
    sudo wget http://central.maven.org/maven2/com/typesafe/scala-logging/scala-logging-api_2.11/2.1.2/scala-logging-api_2.11-2.1.2.jar -P /usr/lib/spark/jars/
    sudo wget http://central.maven.org/maven2/com/typesafe/scala-logging/scala-logging-slf4j_2.11/2.1.1/scala-logging-slf4j_2.11-2.1.1.jar -P /usr/lib/spark/jars/
    sudo wget https://dl.bintray.com/spark-packages/maven/neo4j-contrib/neo4j-spark-connector/2.2.1-M5/neo4j-spark-connector-2.2.1-M5.jar -P /usr/lib/spark/jars/

Upload bootstrap.sh to S3, let's say into 'your_bucket'.
Finally, in your EMR creation script, add this option:

--bootstrap-actions Path="s3://your_bucket/bootstrap.sh"
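If the dependency jars are already in your own bucket (for example the output of sbt pack), the same bootstrap idea could copy them with the AWS CLI instead of wget. This is only a rough sketch: the S3 prefix and the target directory are placeholders, AWS_CMD is a hypothetical override added so the function can be dry-run, and it assumes the aws command is on the node's PATH.

```shell
#!/bin/bash
# bootstrap.sh (sketch): copy dependency jars from S3 onto each node.
set -eu

# Hypothetical override for dry runs; defaults to the AWS CLI.
AWS_CMD="${AWS_CMD:-aws}"

# fetch_jars <s3-prefix> <local-dir>
fetch_jars() {
  mkdir -p "$2"
  # Copy only the *.jar objects found under the prefix.
  "$AWS_CMD" s3 cp "$1" "$2" --recursive --exclude "*" --include "*.jar"
}

# With fewer than two arguments the script only defines fetch_jars.
# Placeholder argument values: s3://your_bucket/jars/ /home/hadoop/jars
if [ "$#" -ge 2 ]; then
  fetch_jars "$1" "$2"
fi
```

The two paths could then be passed through the bootstrap action itself, e.g. --bootstrap-actions Path="s3://your_bucket/bootstrap.sh",Args=[s3://your_bucket/jars/,/home/hadoop/jars].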

Upvotes: 2

WoodChopper

Reputation: 4375

In build.sbt, mark dependencies such as spark-core and spark-sql as provided:

"org.apache.spark" %% "spark-core" % "1.5.1" % "provided",
"org.apache.spark" %% "spark-sql" % "1.5.1" % "provided",

You could also mark other dependencies as provided, so that they are available at compile time but left out of the assembly jar. Then, as you mentioned, you can add those dependencies at spark-submit time:

--jars a.jar,b.jar
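Typing that list by hand gets tedious with many jars. A small helper along these lines could build it from a directory. jars_csv and the paths in the comment are made-up names for illustration, not something Spark provides.

```shell
#!/bin/bash
# Sketch: build the comma-separated value that --jars expects from a directory.
set -eu

# jars_csv <dir>: print every *.jar in <dir> as one comma-separated line.
jars_csv() {
  csv=""
  for jar in "$1"/*.jar; do
    # If the directory has no jars the glob stays literal, so skip it.
    [ -e "$jar" ] || continue
    csv="${csv:+$csv,}$jar"
  done
  printf '%s\n' "$csv"
}

# Hypothetical usage (class and jar names are placeholders):
#   spark-submit --jars "$(jars_csv /home/hadoop/jars)" --class com.example.Main my-app.jar
```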

Upvotes: 1
