Reputation: 45
I have tried several methods. 1) Setting the environment variables:
export PYSPARK_DRIVER_PYTHON=/python_path/bin/python
export PYSPARK_PYTHON=/python_path/bin/python
This does not work. I am sure PYSPARK_DRIVER_PYTHON and PYSPARK_PYTHON are set successfully; I verified with:
env | grep PYSPARK_PYTHON
I want PySpark to use
/python_path/bin/python
as the Python interpreter, but the workers still start with:
python -m deamon
I don't want to symlink the default python to /python_path/bin/python, because that could affect other developers: the default python and /python_path/bin/python are different versions, and both are in production use.
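For completeness, the whole sequence I run is roughly this (same abbreviated interpreter path as above):
export PYSPARK_DRIVER_PYTHON=/python_path/bin/python
export PYSPARK_PYTHON=/python_path/bin/python
env | grep PYSPARK      # both variables show up here
pyspark                 # workers still launch with the default python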
Setting the following in conf/spark-env.sh does not work either:
spark.pyspark.driver.python=/python_path/bin/python
spark.pyspark.python=/python_path/bin/python
When the driver starts, I see warnings like:
conf/spark-env.sh: line 63: spark.pyspark.driver.python=/python_path/bin/python: No such file or directory
conf/spark-env.sh: line 64: spark.pyspark.python=/python_path/bin/python: No such file or directory
Upvotes: 1
Views: 2895
Reputation: 877
1) Check the permissions on your Python directory; maybe Spark doesn't have the correct permissions. Try: sudo chmod -R 777 /python_path/bin/python
2) The Spark documentation says:
Property spark.pyspark.python takes precedence if it is set.
So also try setting spark.pyspark.python in conf/spark-defaults.conf (see the sketch after this list).
3) Also, if your cluster has more than one node, check that Python is installed at that path on every node, because you don't know where the workers will be started (a quick check is sketched below).
4) Spark will use the first Python interpreter available on your system PATH, so as a workaround you can prepend your Python's directory to the PATH variable (example below).
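A rough sketch of point 2, using the path from your question: conf/spark-defaults.conf takes whitespace-separated property/value pairs, while conf/spark-env.sh is sourced as a shell script and only understands shell syntax such as export, which is why the spark.pyspark.*=... lines there produced the "No such file or directory" warnings.
# conf/spark-defaults.conf  (property  value, separated by whitespace)
spark.pyspark.python          /python_path/bin/python
spark.pyspark.driver.python   /python_path/bin/python

# conf/spark-env.sh  (shell script, so use export, not property=value)
export PYSPARK_PYTHON=/python_path/bin/python
export PYSPARK_DRIVER_PYTHON=/python_path/bin/python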
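For point 3, a quick check that the interpreter exists on every node could look like this (workers.txt is a hypothetical file listing your worker hostnames; adjust to your cluster):
# verify the interpreter is present and executable on each worker node
for host in $(cat workers.txt); do
    echo -n "$host: "
    ssh "$host" "test -x /python_path/bin/python && echo OK || echo MISSING"
done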
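And for point 4, a minimal sketch of the PATH workaround, assuming /python_path/bin contains the python binary:
# put the desired interpreter first on PATH before launching pyspark or spark-submit
export PATH=/python_path/bin:$PATH
which python    # should now print /python_path/bin/python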
Upvotes: 0