H. Shindoh

Reputation: 926

How can we modify the PySpark configuration in Jupyter?

I am currently working with JupyterLab and PySpark 2.1.1.

I want to change spark.yarn.queue and the master from a notebook. Because of the kernel, spark and sc are already available when I open a notebook.

Following this question, I tried

spark.conf.set("spark.yarn.queue", "my_queue")

But according to spark.sparkContext.getConf(), the above line has no effect.
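
This is expected behavior in Spark 2.x: spark.conf.set only updates the session's runtime configuration, while properties such as spark.yarn.queue are read once when the SparkContext starts. A minimal sketch illustrating the difference, assuming the spark session the kernel provides (the queue name is the hypothetical one from above):

spark.conf.set("spark.yarn.queue", "my_queue")
spark.conf.get("spark.yarn.queue")                    # "my_queue" -- the runtime conf was updated
spark.sparkContext.getConf().get("spark.yarn.queue")  # old value (or None if never set) -- the running context is untouched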

spark.conf.setMaster("yarn-cluster")

does not work either, because spark.conf has no such method.
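
The master is fixed when the underlying SparkContext is created, so the usual in-notebook workaround is to stop the kernel's session and build a new one. A rough sketch, assuming a client-mode "yarn" master (a notebook driver cannot run in cluster mode) and that restarting the context is acceptable:

from pyspark.sql import SparkSession

spark.stop()  # stop the context the kernel created
spark = (SparkSession.builder
         .master("yarn")                          # client mode; "yarn-cluster" will not work from a notebook
         .config("spark.yarn.queue", "my_queue")  # hypothetical queue name from the question
         .getOrCreate())                          # builds a fresh SparkContext with these settings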

Question: How can I change the configuration (queue and master) from a Jupyter notebook?

(Or should I set any environment variables?)

Upvotes: 0

Views: 2844

Answers (1)

martinarroyo

Reputation: 9701

You can try initializing Spark beforehand, not in the notebook. Run this in your terminal:

export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS='notebook'

pyspark --master <your master> --conf <your configuration> <or any other option that pyspark supports>
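
For example, with a hypothetical queue name and a YARN client-mode master, the launch could look like:

pyspark --master yarn --conf spark.yarn.queue=my_queue

Settings passed on the command line this way are baked into the SparkContext, so spark.sparkContext.getConf() inside the notebook will reflect them.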

My source

Upvotes: 1
