Reputation: 583
I'm trying to configure apache-spark on macOS. All the online guides ask you either to download the Spark tar and set up some env variables, or to use brew install apache-spark and then set up some env variables.

I installed apache-spark using brew install apache-spark.
I run pyspark in a terminal and get a Python prompt, which suggests the installation was successful.
But when I try to import pyspark in my Python file, I get the error ImportError: No module named pyspark.

The strangest thing, which I'm not able to understand, is how it is able to start a pyspark REPL yet unable to import the module into Python code.
I also tried pip install pyspark, but the module is still not recognized.
In addition to installing apache-spark with Homebrew, I've set up the following env variables:
if which java > /dev/null; then export JAVA_HOME=$(/usr/libexec/java_home); fi
if which pyspark > /dev/null; then
  export SPARK_HOME="/usr/local/Cellar/apache-spark/2.1.0/libexec/"
  export PYSPARK_SUBMIT_ARGS="--master local[2]"
fi
Please suggest what exactly is missing from my setup to run pyspark code on my local machine.
Upvotes: 0
Views: 3552
Reputation: 4719
Sorry, I don't use a Mac, but on Linux there is another way besides the above answer: symlink the pyspark package that ships with Spark into your site-packages directory.
sudo ln -s $SPARK_HOME/python/pyspark /usr/local/lib/python2.7/site-packages
Python searches the entries on its module path in order, so it will pick the module up from /path/to/your/python/site-packages last.
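A minimal sketch of why the symlink works, using a throwaway package and illustrative /tmp paths (not the real Spark or site-packages layout), assuming python3 is on your PATH: once a symlink to the package sits in a directory on Python's module path, the import resolves through the link.

```shell
# Create a throwaway package in one place...
mkdir -p /tmp/src_pkg /tmp/fake_site
echo "NAME = 'linked'" > /tmp/src_pkg/__init__.py

# ...symlink it into a directory that Python searches (here simulated
# via PYTHONPATH rather than a real site-packages dir)...
ln -sfn /tmp/src_pkg /tmp/fake_site/linkedpkg

# ...and the import succeeds through the symlink.
PYTHONPATH=/tmp/fake_site python3 -c "import linkedpkg; print(linkedpkg.NAME)"
# prints: linked
```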
Upvotes: 1
Reputation: 454
The pyspark module is not included in your Python installation. Try this instead:
import os
import sys

os.environ['SPARK_HOME'] = "/usr/local/Cellar/apache-spark/2.1.0/libexec/"
sys.path.append("/usr/local/Cellar/apache-spark/2.1.0/libexec/python")
sys.path.append("/usr/local/Cellar/apache-spark/2.1.0/libexec/python/lib/py4j-0.10.4-src.zip")

try:
    from pyspark import SparkContext
    from pyspark import SparkConf
except ImportError as e:
    print("error importing spark modules", e)
    sys.exit(1)

sc = SparkContext('local[*]', 'PySpark')
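The sys.path.append calls above work because Python resolves imports by scanning the directories (and zip files) on sys.path in order. A self-contained sketch of that mechanism, using a throwaway module instead of pyspark itself:

```python
import os
import sys
import tempfile

# Create a throwaway directory containing a tiny module, standing in
# for Spark's bundled python/ directory.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "fakespark.py"), "w") as f:
    f.write("VERSION = '2.1.0'\n")

# Before the directory is on sys.path, the import fails...
try:
    import fakespark
    found_before = True
except ImportError:
    found_before = False

# ...and after appending it, the same import succeeds.
sys.path.append(demo_dir)
import fakespark

print(found_before)       # False
print(fakespark.VERSION)  # 2.1.0
```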
If you don't want to do that in code, add the paths to your environment instead. And don't forget to include the Python path:
export SPARK_HOME=/usr/local/Cellar/apache-spark/2.1.0/libexec/
export PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.10.4-src.zip:$PYTHONPATH
export PATH=$SPARK_HOME/python:$PATH
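You can check that a PYTHONPATH entry actually reaches the interpreter's sys.path (which is how the exports above make the bundled pyspark importable). A quick sketch with an illustrative /tmp path, assuming python3 is available:

```shell
# Any directory placed on PYTHONPATH shows up on sys.path at startup.
mkdir -p /tmp/pp_demo
PYTHONPATH=/tmp/pp_demo python3 -c "import sys; print('/tmp/pp_demo' in sys.path)"
# prints: True
```

Run the same one-liner with your real SPARK_HOME/python path to confirm the exports took effect in a new shell.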
Upvotes: 5