Reputation: 9875
I know that when you are in client mode in PySpark, you cannot set configurations in your script, because the JVM starts as soon as the libraries are loaded.
So the way to set the configurations is to edit the shell script that launches it, spark-env.sh, according to this documentation here.
If I want to change the maximum result size at the driver, I would normally set spark.driver.maxResultSize. What is the equivalent to that in the spark-env.sh file?
Some of the environment variables are easy to map: SPARK_DRIVER_MEMORY is clearly the setting for spark.driver.memory. But what is the environment variable for spark.driver.maxResultSize? Thank you.
Upvotes: 4
Views: 11715
Reputation: 4925
The configuration file is conf/spark-defaults.conf.
If conf/spark-defaults.conf does not exist, create it from the template:
cp conf/spark-defaults.conf.template conf/spark-defaults.conf
Then add a configuration line such as:
spark.driver.maxResultSize 2g
There are many configurations available; refer to Spark Configuration.
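The steps above can be sketched as a small shell sequence. This is just a sketch: `SPARK_CONF_DIR` below stands in for your real `$SPARK_HOME/conf` directory (here a temporary directory is used so the snippet is self-contained, and the template file is simulated rather than the one shipped with Spark).

```shell
# Stand-in for $SPARK_HOME/conf (assumption: adjust to your install).
SPARK_CONF_DIR="$(mktemp -d)"

# Simulate the template that ships with a Spark distribution.
printf '# spark.master  spark://master:7077\n' \
  > "$SPARK_CONF_DIR/spark-defaults.conf.template"

# Create the live config from the template only if it does not exist yet.
[ -f "$SPARK_CONF_DIR/spark-defaults.conf" ] || \
  cp "$SPARK_CONF_DIR/spark-defaults.conf.template" \
     "$SPARK_CONF_DIR/spark-defaults.conf"

# Append the driver result-size limit.
echo 'spark.driver.maxResultSize 2g' >> "$SPARK_CONF_DIR/spark-defaults.conf"

# Confirm the setting landed in the file.
grep maxResultSize "$SPARK_CONF_DIR/spark-defaults.conf"
```

Settings in spark-defaults.conf are picked up by spark-submit and the PySpark shell at launch, which is why this works even in client mode where the JVM is already running by the time your script executes.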
Upvotes: 4