Reputation: 2566
I'm writing a Spark application, using sbt assembly to create a fat jar which I can send to spark-submit (through Amazon EMR). My application uses typesafe-config, with a reference.conf file inside my resources directory.
My jar is on Amazon S3, and I use the command aws emr add-steps ... to create a new Spark job (which downloads the jar to the cluster and sends it to spark-submit).
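To make the setup concrete, here's a trimmed-down version of what I have (key names simplified):

# src/main/resources/reference.conf
my.app.threshold = 10

import com.typesafe.config.ConfigFactory

val config = ConfigFactory.load()   // picks up reference.conf from the classpath
val threshold = config.getInt("my.app.threshold")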
I know that in general I can use an application.conf to override the settings. However, since I'm using Spark (and a fat jar), I need some way to deploy my overrides.
What is the recommended way of overriding the application config settings when using Spark?
Upvotes: 4
Views: 1550
Reputation: 2094
You can pass Typesafe Config overrides as JVM system properties (plain --conf entries are Spark properties and won't be visible to Typesafe Config):

spark-submit ... --driver-java-options "-Dmy.app.config.value=50 -Dconfig.file=other.conf" ... fat.jar

When using typesafe.config.ConfigFactory.load(), values specified on the command line override those specified in other.conf, which in turn override those specified in the reference.conf bundled in your fat jar.
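For example, assuming reference.conf defines a key my.app.config.value, the driver code would see the override like this:

import com.typesafe.config.ConfigFactory

val config = ConfigFactory.load()
// Resolution order (highest priority first):
//   1. -Dmy.app.config.value=50 from the JVM command line
//   2. the file pointed to by -Dconfig.file=other.conf
//   3. reference.conf inside the fat jar
val value = config.getInt("my.app.config.value")   // 50 in this example

Note that --driver-java-options only reaches the driver JVM; if your executors also call ConfigFactory.load(), pass the same properties via spark.executor.extraJavaOptions and ship other.conf to the executors with --files.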
Upvotes: 5
Reputation: 2305
In my Spark Java code, I override the application config like this:
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

// Build the Spark configuration programmatically, overriding defaults
SparkConf sparkConf = new SparkConf();
sparkConf.setMaster(sparkMaster);  // sparkMaster defined elsewhere, e.g. "yarn"
sparkConf.set("spark.executor.memory", "1024M");
sparkConf.set("spark.default.parallelism", "48");
sparkConf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
JavaSparkContext ctx = new JavaSparkContext(sparkConf);
Upvotes: 0