Reputation: 9148
I am starting a PySpark Jupyter notebook server with this script:
#!/bin/bash
ipaddress=...
echo "Start notebook server at IP address $ipaddress"
function snotebook ()
{
#Spark path (based on your computer)
SPARK_PATH=/home/.../software/spark-2.3.1-bin-hadoop2.7
export PYSPARK_DRIVER_PYTHON="jupyter"
export PYSPARK_DRIVER_PYTHON_OPTS="notebook"
# For python 3 users, you have to add the line below or you will get an error
export PYSPARK_PYTHON=python3
$SPARK_PATH/bin/pyspark --master local[10]
}
snotebook --no-browser --ip $ipaddress --certfile=/home/.../local/mycert.pem --keyfile /home/.../local/mykey.key
I wonder how to set the port. Is there an environment variable that I can set? I would like to determine the port before the notebook starts. I tried --port 7999.
Upvotes: 0
Views: 3017
Reputation: 191701
If you mean the Spark UI ports: spark-env.sh lists these two pairs of environment variables that you can override, or set in that file:
# - SPARK_MASTER_PORT / SPARK_MASTER_WEBUI_PORT, to use non-default ports for the master
# - SPARK_WORKER_PORT / SPARK_WORKER_WEBUI_PORT, to use non-default ports for the worker
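For example, something like this in conf/spark-env.sh, or exported before starting the standalone daemons (the port values here are arbitrary; the defaults in the comments are the standalone defaults):
export SPARK_MASTER_PORT=7078         # master RPC port (default 7077)
export SPARK_MASTER_WEBUI_PORT=8081   # master web UI (default 8080)
export SPARK_WORKER_PORT=7079         # worker RPC port (random by default)
export SPARK_WORKER_WEBUI_PORT=8082   # worker web UI (default 8081)
Note these only matter for a standalone master/worker; with --master local[10] as in the question, they would not apply.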
I'm not sure about the Jupyter values, or whether PySpark even passes them through, but if jupyter notebook --port
works on its own, then I would try
export PYSPARK_DRIVER_PYTHON_OPTS="notebook --port=7999"
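Dropped into the script from the question, that would look something like this sketch (the port value is only an example; the ... paths are the question's placeholders):
function snotebook ()
{
    SPARK_PATH=/home/.../software/spark-2.3.1-bin-hadoop2.7
    export PYSPARK_DRIVER_PYTHON="jupyter"
    # Hard-code the notebook port here instead of passing it on the command line
    export PYSPARK_DRIVER_PYTHON_OPTS="notebook --port=7999"
    export PYSPARK_PYTHON=python3
    $SPARK_PATH/bin/pyspark --master local[10]
}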
If you want to pass all the arguments from snotebook
into the variable, then you need
export PYSPARK_DRIVER_PYTHON_OPTS="notebook $@"
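Putting it together, a sketch of the full forwarding version ($* is used here to join all of the function's arguments into a single string; the ... paths are still the question's placeholders):
function snotebook ()
{
    SPARK_PATH=/home/.../software/spark-2.3.1-bin-hadoop2.7
    export PYSPARK_DRIVER_PYTHON="jupyter"
    # $* joins every argument passed to snotebook into one option string for jupyter
    export PYSPARK_DRIVER_PYTHON_OPTS="notebook $*"
    export PYSPARK_PYTHON=python3
    $SPARK_PATH/bin/pyspark --master local[10]
}
snotebook --no-browser --ip $ipaddress --port 7999 --certfile=/home/.../local/mycert.pem --keyfile /home/.../local/mykey.key
Called that way, the --port 7999 you already tried would actually reach jupyter notebook.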
Upvotes: 1