Reputation: 1067
I am using a configuration file, following the Configure Spark guide, to set up the EMR configuration on AWS. For example, changing spark.executor.extraClassPath
is done via the following settings:
{
  "Classification": "spark-defaults",
  "Properties": {
    "spark.executor.extraClassPath": "/home/hadoop/mongo-hadoop-spark.jar"
  }
}
It works perfectly and does change spark.executor.extraClassPath
in the EMR Spark conf, but EMR has some preset default paths in spark.executor.extraClassPath
. So instead of overwriting spark.executor.extraClassPath
, I would like to know if there is a way to append my path and keep the default paths, such as:
{
  "Classification": "spark-defaults",
  "Properties": {
    "spark.executor.extraClassPath": "{$extraClassPath}:/home/hadoop/mongo-hadoop-spark.jar"
  }
}
Upvotes: 8
Views: 7983
Reputation: 1166
You can specify it in your EMR template as follows:
- Classification: spark-defaults
  ConfigurationProperties:
    spark.jars: Your jar location
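For context, a minimal sketch of where that block sits inside a CloudFormation EMR cluster resource; the resource name SparkCluster is an assumption, the jar path is taken from the question, and the cluster's other required properties are omitted:

SparkCluster:
  Type: AWS::EMR::Cluster
  Properties:
    # Name, ReleaseLabel, Instances, JobFlowRole, ServiceRole, etc. go here
    Configurations:
      - Classification: spark-defaults
        ConfigurationProperties:
          spark.jars: /home/hadoop/mongo-hadoop-spark.jar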
Upvotes: 1
Reputation: 1319
You can put "spark.jars" in spark-defaults.conf
, so this configuration will be used even if you are running a notebook. Hope it solves your problem.
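As a minimal sketch, the spark-defaults.conf entry could look like this (the jar path is taken from the question):

spark.jars /home/hadoop/mongo-hadoop-spark.jar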
Upvotes: 0
Reputation: 1026
Specifying the full path for all additional jars at job submit will work for you:
--jars
This option will submit these jars to all the executors and will not change the default extra classpath.
One more option I know, but I have only tried it with a YARN conf, not sure about EMR though:
./bin/spark-submit --class "SparkTest" --master local[*] --jars /fullpath/first.jar,/fullpath/second.jar /fullpath/your-program.jar
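On EMR, where Spark runs on YARN, the same idea would presumably look like the sketch below; this is untested on EMR, --deploy-mode cluster is just one common choice, and the mongo jar path is taken from the question:

spark-submit --class "SparkTest" --master yarn --deploy-mode cluster --jars /home/hadoop/mongo-hadoop-spark.jar /fullpath/your-program.jar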
Upvotes: 0