ZhouQuan

Reputation: 1067

EMR conf spark-defaults settings

I am using a configuration file, following the Configure Spark guide, to set up the EMR configuration on AWS. For example, changing spark.executor.extraClassPath is done via the following settings:

{
     "Classification": "spark-defaults",
     "Properties": {
         "spark.executor.extraClassPath": "/home/hadoop/mongo-hadoop-spark.jar"
     }
}

It works perfectly and does change spark.executor.extraClassPath in the EMR Spark conf, but EMR has some preset default paths in spark.executor.extraClassPath. Instead of overwriting spark.executor.extraClassPath, I would like to know if there is a way to append my path and keep the default paths, such as:

{
     "Classification": "spark-defaults",
     "Properties": {
         "spark.executor.extraClassPath": "{$extraClassPath}:/home/hadoop/mongo-hadoop-spark.jar"
     }
}

Upvotes: 8

Views: 7983

Answers (3)

Emerson

Reputation: 1166

You can specify it in your EMR template as follows:

Classification: spark-defaults
ConfigurationProperties:
  spark.jars: Your jar location
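
For context, a minimal sketch of how this block could sit inside the Configurations list of an AWS::EMR::Cluster resource in a CloudFormation template (the resource name and jar path below are placeholders, and the other required cluster properties are omitted):

MyEMRCluster:
  Type: AWS::EMR::Cluster
  Properties:
    Configurations:
      - Classification: spark-defaults
        ConfigurationProperties:
          spark.jars: /home/hadoop/mongo-hadoop-spark.jar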

Upvotes: 1

Sachin Janani

Reputation: 1319

You can put "spark.jars" in spark-defaults.conf, so even if you are using a notebook this configuration will be used. Hope it solves your problem.
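
As a sketch, reusing the jar path from the question, the entry in spark-defaults.conf (typically /etc/spark/conf/spark-defaults.conf on EMR) would look like:

spark.jars    /home/hadoop/mongo-hadoop-spark.jar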

Upvotes: 0

sandesh dahake

Reputation: 1026

Specifying the full path for all additional jars at job submission time will work for you.

--jars

This option will submit these jars to all the executors and will not change the default extra classpath.

This is one more option I know of, but I have only tried it with a YARN conf; not sure about EMR though:

./bin/spark-submit --class "SparkTest" --master local[*] --jars /fullpath/first.jar,/fullpath/second.jar /fullpath/your-program.jar
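
Since the answer mentions YARN, a sketch of the same command against a YARN master (the class name and jar paths are placeholders carried over from the example above):

./bin/spark-submit --class "SparkTest" --master yarn --deploy-mode cluster \
  --jars /fullpath/first.jar,/fullpath/second.jar \
  /fullpath/your-program.jar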

Upvotes: 0
