makansij

Reputation: 9875

How to set spark driver maxResultSize when in client mode in pyspark?

I know when you are in client mode in pyspark, you cannot set configurations in your script, because the JVM gets started as soon as the libraries are loaded.

So, the way to set the configurations is to edit the shell script that launches it, spark-env.sh, according to the documentation here.

If I want to change the maximum result size at the driver, I would normally set this: spark.driver.maxResultSize. What is the equivalent to that in the spark-env.sh file?

Some of the environment variables are easy to map: for example, SPARK_DRIVER_MEMORY is clearly the setting for spark.driver.memory. But what is the environment variable for spark.driver.maxResultSize? Thank you.

Upvotes: 4

Views: 11715

Answers (1)

Rockie Yang

Reputation: 4925

The configuration file is conf/spark-defaults.conf.

If conf/spark-defaults.conf does not exist, create it from the template:

cp conf/spark-defaults.conf.template conf/spark-defaults.conf

Then add a configuration line like:

spark.driver.maxResultSize  2g

There are many configurations available; refer to Spark Configuration.
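If you would rather not edit spark-defaults.conf, the same property can also be passed at launch time with the --conf flag, which both pyspark and spark-submit accept; a sketch (2g is an example value, and my_script.py is a hypothetical script name):

```shell
# Set spark.driver.maxResultSize when launching the pyspark shell in client mode.
pyspark --conf spark.driver.maxResultSize=2g

# The same flag works with spark-submit:
spark-submit --conf spark.driver.maxResultSize=2g my_script.py
```

Properties given with --conf override those in spark-defaults.conf for that session only.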

Upvotes: 4
