MosesA

Reputation: 965

Setting up PySpark

I have Scala and Spark installed and working, but PySpark isn't working. Here's the output I'm getting:

user@ubuntu:~/spark$ pyspark 
Python 2.7.6 (default, Jun 22 2015, 17:58:13) 
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Traceback (most recent call last):
  File "/home/user/spark/python/pyspark/shell.py", line 30, in <module>
    import pyspark
  File "pyspark.py", line 1, in <module>
NameError: name 'sc' is not defined

Here's my .bashrc:

export SPARK_HOME=/home/user/spark
export PATH=$PATH:$SPARK_HOME/bin:$PATH
export PYTHONPATH=$SPARK_HOME/python/:$PYTHONPATH
export PYTHONPATH=$SPARK_HOME/python/lib/py4j-0.9-src.zip:$PYTHONPATH

What am I doing wrong?

Thanks

Upvotes: 2

Views: 1170

Answers (2)

Anthony

Reputation: 1543

I couldn't reproduce the problem, but I also don't see why it's necessary to set SPARK_HOME, PATH and PYTHONPATH. If pyspark is started via $SPARK_HOME/bin/pyspark, a SparkContext (sc) should already be created for you.
If you start with ipython or python instead, you can use the findspark package to locate Spark and create a SparkContext:

$ python
>>> import findspark
>>> findspark.init('/home/user/spark')
>>> from pyspark import SparkContext
>>> sc = SparkContext('local[4]', 'myapp')
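Once sc exists, a quick sanity check (a minimal sketch, assuming the context above was created without errors):

>>> sc.parallelize(range(10)).sum()
45
>>> sc.stop()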

Upvotes: 2

zero323

Reputation: 330063

It looks like you have an import conflict. Somewhere on your path there is a pyspark.py file which is picked up before the actual pyspark package.
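Based on your traceback, the culprit is likely a file named pyspark.py in the directory you launch the shell from. One way to check which module Python actually resolves (a sketch, assuming the PYTHONPATH from your .bashrc is in effect):

$ python -c "import pyspark; print(pyspark.__file__)"

If the printed path is not under /home/user/spark/python/pyspark/, rename or delete the shadowing pyspark.py (and any pyspark.pyc next to it) and try again.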

Upvotes: 2
