Michael Lihs

Reputation: 8220

How to pass parameters / properties to Spark jobs with spark-submit

I am running a Spark job implemented in Java using spark-submit. I would like to pass parameters to this job - e.g. a time-start and time-end parameter to parametrize the Spark application.

What I tried was using the

--conf key=value

option of the spark-submit script, but when I try to read the parameter in my Spark job with

sparkContext.getConf().get("key")

I get an exception:

Exception in thread "main" java.util.NoSuchElementException: key

Furthermore, when I use sparkContext.getConf().toDebugString() I don't see my value in the output.

Further notice: since I want to submit my Spark job via the Spark REST service, I cannot use an OS environment variable or the like.

Is there any way to do this?

Upvotes: 12

Views: 30884

Answers (3)

renzherl

Reputation: 161

You can pass parameters like this:

./bin/spark-submit \
  --class $classname \
  --master XXX \
  --deploy-mode XXX \
  --conf XXX \
  $application-jar --key1 $value1 --key2 $value2

Make sure to replace key1, key2, value1 and value2 with the proper values.
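In the job itself you then read these pairs from the main method's args array. A minimal sketch, assuming the arguments always arrive as --key value pairs (MyJob and the parameter names are placeholders):

import java.util.HashMap;
import java.util.Map;

public class MyJob {
    public static void main(String[] args) {
        // collect "--key value" pairs into a map
        Map<String, String> params = new HashMap<>();
        for (int i = 0; i + 1 < args.length; i += 2) {
            params.put(args[i].replaceFirst("^--", ""), args[i + 1]);
        }
        String timeStart = params.get("time-start"); // null if the flag was not passed
        String timeEnd = params.get("time-end");
    }
}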

Upvotes: 2

user6022341

Reputation:

Spark configuration only propagates keys in the spark.* namespace. If you don't want to use an independent configuration tool, you can try:

--conf spark.mynamespace.key=value
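Inside the job the value is then available through the Spark configuration. A minimal sketch (spark.mynamespace.key is an arbitrary name of your choosing):

import org.apache.spark.SparkConf;

public class ReadConf {
    public static void main(String[] args) {
        // new SparkConf() picks up the spark.* properties set by spark-submit
        SparkConf conf = new SparkConf();
        // the two-argument get avoids the NoSuchElementException from the question
        String value = conf.get("spark.mynamespace.key", "default");
    }
}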

Upvotes: 6

VladoDemcak

Reputation: 5259

Since you want to pass custom properties, you need to place them after the application jar in spark-submit (as in the spark-submit example below, [application-arguments] should be your properties; --conf is for Spark configuration properties).

--conf: Arbitrary Spark configuration property in key=value format. For values that contain spaces wrap “key=value” in quotes (as shown).

./bin/spark-submit \
  --class <main-class> \
  --master <master-url> \
  --deploy-mode <deploy-mode> \
  --conf <key>=<value> \
  ... # options
  <application-jar> \
  [application-arguments] <--- your app arguments go here

So when you run spark-submit .... app.jar key=value, you will get key=value as args[0] in the main method.

public static void main(String[] args) {
    String firstArg = args[0]; // equal to "key=value"
}

But if you want to use key-value pairs, you need to parse your app arguments somehow.

You can check the Apache Commons CLI library or some alternative.
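For example, a minimal sketch using Commons CLI (version 1.3+ for DefaultParser); the option names mirror the time-start/time-end parameters from the question:

import org.apache.commons.cli.CommandLine;
import org.apache.commons.cli.DefaultParser;
import org.apache.commons.cli.Options;
import org.apache.commons.cli.ParseException;

public class MyJob {
    public static void main(String[] args) throws ParseException {
        Options options = new Options();
        // third argument true: the option expects a value
        options.addOption("s", "time-start", true, "start of the time range");
        options.addOption("e", "time-end", true, "end of the time range");

        CommandLine cmd = new DefaultParser().parse(options, args);
        String timeStart = cmd.getOptionValue("time-start");
        String timeEnd = cmd.getOptionValue("time-end");
        // ... create the SparkContext and use the values
    }
}

You would then submit, for example, as: spark-submit .... app.jar --time-start 2016-01-01 --time-end 2016-01-31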

Upvotes: 12
