Maayan

Reputation: 303

Spark 2.1.1 with Typesafe Config

I'm trying to support an external configuration file for my Spark application using Typesafe Config.

I'm loading the application.conf file in my application code (on the driver) like this:

val config = ConfigFactory.load()
val myProp = config.getString("app.property")
val df = spark.read.avro(myProp)

application.conf looks like this:

app.property="some value"

spark-submit execution looks like this:

spark-submit \
        --class com.myapp.Main \
        --conf spark.shuffle.service.enabled=true \
        --conf spark.dynamicAllocation.enabled=true \
        --conf spark.dynamicAllocation.minExecutors=56 \
        --conf spark.dynamicAllocation.maxExecutors=1000 \
        --driver-class-path $HOME/conf/*.conf \
        --files $HOME/conf/application.conf \
        my-app-0.0.1-SNAPSHOT.jar

It doesn't seem to work, and I'm getting:

Exception in thread "main" com.typesafe.config.ConfigException$Missing: No configuration setting found for key 'app'
    at com.typesafe.config.impl.SimpleConfig.findKey(SimpleConfig.java:124)
    at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:147)
    at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:159)
    at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:164)
    at com.typesafe.config.impl.SimpleConfig.getString(SimpleConfig.java:206)
    at com.paypal.cfs.fpti.Main$.main(Main.scala:42)
    at com.paypal.cfs.fpti.Main.main(Main.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:750)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Looking at the logs, I can see that "--files" works, so this looks like a classpath issue:

18/03/13 01:08:30 INFO SparkContext: Added file file:/home/user/conf/application.conf at file:/home/user/conf/application.conf with timestamp 1520928510820
18/03/13 01:08:30 INFO Utils: Copying /home/user/conf/application.conf to /tmp/spark-2938fde1-fa4a-47af-8dc6-1c54b5e89d48/userFiles-c2cec57f-18c8-491d-8679-df7e7da45e05/application.conf

Upvotes: 3

Views: 1398

Answers (2)

Maayan

Reputation: 303

Turns out I was pretty close to the answer to begin with... here is how it worked for me:

spark-submit \
    --class com.myapp.Main \
    --conf spark.shuffle.service.enabled=true \
    --conf spark.dynamicAllocation.enabled=true \
    --conf spark.dynamicAllocation.minExecutors=56 \
    --conf spark.dynamicAllocation.maxExecutors=1000 \
    --driver-class-path $APP_HOME/conf \
    --files $APP_HOME/conf/application.conf \
    $APP_HOME/my-app-0.0.1-SNAPSHOT.jar

where $APP_HOME contains the following:

conf/application.conf
my-app-0.0.1-SNAPSHOT.jar

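The trick seems to be that --driver-class-path has to point at the folder that contains application.conf, not at the file itself: ConfigFactory.load() looks the file up as a classpath resource, and a classpath entry is a directory or a jar.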
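
If it is unclear whether the driver can see the file, a small diagnostic may help. This is only a sketch that uses nothing beyond the JDK; ClasspathCheck and describe are made-up names for illustration, not part of Spark or Typesafe Config. It asks the thread's context class loader (which, as far as I know, is what ConfigFactory.load() uses by default) where application.conf is:

```scala
object ClasspathCheck {
  // Hypothetical helper: reports where a resource is visible on the classpath, if at all.
  def describe(resource: String): String = {
    // Fall back to this class's loader if no context class loader is set.
    val loader = Option(Thread.currentThread().getContextClassLoader)
      .getOrElse(getClass.getClassLoader)
    Option(loader.getResource(resource)) match {
      case Some(url) => s"$resource found at $url"
      case None      => s"$resource is NOT on the classpath"
    }
  }

  def main(args: Array[String]): Unit = {
    // Run on the driver before calling ConfigFactory.load().
    println(describe("application.conf"))
  }
}
```

With --driver-class-path pointing at the conf folder, this should report a file: URL inside that folder; if it reports that the file is not on the classpath, ConfigFactory.load() will not find the app key either.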

Upvotes: 2

Álvaro Valencia

Reputation: 1217

In order to specify the config file path, you may pass it as an application argument, and then read it from the args variable of the main class.

This is how you would execute the spark-submit command. Note that I've specified the config file after the application jar.

spark-submit \
        --class com.myapp.Main \
        --conf spark.shuffle.service.enabled=true \
        --conf spark.dynamicAllocation.enabled=true \
        --conf spark.dynamicAllocation.minExecutors=56 \
        --conf spark.dynamicAllocation.maxExecutors=1000 \
        my-app-0.0.1-SNAPSHOT.jar $HOME/conf/application.conf

And then, load the config file from the path specified in args(0):

import java.io.File
import com.typesafe.config.ConfigFactory
[...]
val config = ConfigFactory.parseFile(new File(args(0)))

Now you have access to the properties of your application.conf file.

val myProp = config.getString("app.property")
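
Put together, a slightly more defensive version could look like the sketch below. It assumes the Typesafe Config library is on the classpath; loadConfig is a hypothetical helper name, and the fallback to ConfigFactory.load() is my own addition rather than something the question requires. The explicit file check is there because, as far as I know, parseFile quietly returns an empty config when the file does not exist, which would lead to the same "No configuration setting found" error.

```scala
import java.io.File
import com.typesafe.config.{Config, ConfigFactory}

object Main {
  // Hypothetical helper: values from the external file win,
  // anything missing falls back to the classpath configuration.
  def loadConfig(path: String): Config = {
    val file = new File(path)
    // Fail early with a clear message instead of getting an empty config.
    require(file.isFile, s"Config file not found: $path")
    ConfigFactory.parseFile(file)
      .withFallback(ConfigFactory.load())
      .resolve()
  }

  def main(args: Array[String]): Unit = {
    require(args.nonEmpty, "usage: Main <path-to-application.conf>")
    val config = loadConfig(args(0))
    val myProp = config.getString("app.property")
    println(s"app.property = $myProp")
  }
}
```

Keep in mind that the path is resolved on the machine where the driver runs. In client mode that is the machine running spark-submit; in cluster mode the file would have to be shipped to the driver (for example with --files) and referenced by the name it has there.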

Hope it helps.

Upvotes: 0
