Reputation: 2188
Where is a list of all (valid, built-in) Spark properties?
The list of Available Properties in the official Spark documentation does not include all (valid, built-in) properties for the current stable version of Spark (2.4.4 as of 2020-01-22). An example is spark.sql.shuffle.partitions, which defaults to 200. Unfortunately, properties like this one do not appear to be accessible via any of sparkConf.getAll(), sparkConf.toDebugString(), or spark.sql("SET -v").
Rather, built-in defaults appear to be accessible only by explicit name (e.g. sparkConf.get("foo")). However, this does not help me, since the exact property name must already be known, and I need to survey properties that I don't already know about for debugging/optimization/support purposes.
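For illustration, a minimal sketch of the lookups described above (assuming a spark-shell session where spark is the SparkSession):

// properties explicitly set on the context; built-in SQL defaults are missing from this output
spark.sparkContext.getConf.getAll.foreach(println)
spark.sparkContext.getConf.toDebugString

// fetching a known key by name does work, e.g. via the runtime config:
spark.conf.get("spark.sql.shuffle.partitions")   // "200", even though it was never set explicitly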
Upvotes: 4
Views: 2080
Reputation: 933
You can use:
sql("SET -v").show(500,false)
This will give you a near-complete list, not including the internal properties; a small filtering sketch follows the sample output below.
+-----------------------------------------------------------------+-------------------------------------------------+
|key |value |
+-----------------------------------------------------------------+-------------------------------------------------+
|spark.sql.adaptive.enabled |false |
|spark.sql.adaptive.shuffle.targetPostShuffleInputSize |67108864b |
|spark.sql.autoBroadcastJoinThreshold |10485760 |
|spark.sql.avro.compression.codec |snappy |
|spark.sql.avro.deflate.level |-1 |
...
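Since SET -v returns a DataFrame, you can also filter or search the listing instead of paging through it. A small sketch (the exact column names may vary slightly between versions):

import org.apache.spark.sql.functions.col

// treat the SET -v output as a DataFrame and narrow it down, e.g. to shuffle-related keys
val confs = sql("SET -v")                       // columns include "key" and "value"
confs.filter(col("key").contains("shuffle"))
     .orderBy("key")
     .show(100, false)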
Edit:
I will mention that Spark has many baked-in defaults that may not show up as config properties. I'd suggest taking a look at the SQLConf class in the source code. Unfortunately, due to the complexity of Spark and its nearly untold number of configs, not all of them live in SQLConf; some are scattered throughout the code. Spark also often has multiple configs that override each other, and this too can only be deduced from the source code.
Upvotes: 3
Reputation: 39473
I don't think this is a complete answer, but it can help. It will show more properties than your alternatives, and at least it will show options modified by some kind of middleware, like Livy.
Set this parameter:
spark.logConf=true
Now all your session configuration will be saved in the YARN log at INFO level. Run yarn logs -applicationId <your app id> and search for spark.app.name= to find your session properties.
Another problem is that you will only see the property values after the job has executed.
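For completeness, a minimal sketch of enabling this from code rather than through middleware (the appName is hypothetical; spark.logConf simply makes the effective SparkConf get logged at INFO when the context starts):

import org.apache.spark.sql.SparkSession

// ask Spark to log the resolved configuration at startup,
// so it ends up in the driver / YARN logs described above
val spark = SparkSession.builder()
  .appName("conf-survey")              // hypothetical application name
  .config("spark.logConf", "true")     // log the effective SparkConf at INFO level
  .getOrCreate()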
Upvotes: 1