Reputation: 862
I am just getting started with Spark and am running it in standalone mode on an Amazon EC2 instance. I was trying the examples from the documentation, and while going through the one called Simple App I keep getting this error: NameError: name 'numAs' is not defined
from pyspark import SparkContext
logFile = "$YOUR_SPARK_HOME/README.md" # Should be some file on your system
sc = SparkContext("local", "Simple App")
logData = sc.textFile(logFile).cache()
numAs = logData.filter(lambda s: 'a' in s).count()
numBs = logData.filter(lambda s: 'b' in s).count()
print "Lines with a: %i, lines with b: %i" % (numAs, numBs)
How do I use an editor with Spark instead of the interactive Python shell? And why do I keep getting this error?
Thanks for any help/guidance.
Upvotes: 0
Views: 708
Reputation: 2318
Put all your Python code in a .py file, then submit that file with spark-submit, like below:
# Run a Python application on a Spark Standalone cluster
./bin/spark-submit \
--master spark://207.184.161.138:7077 \
examples/src/main/python/pi.py \
1000
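As a rough sketch of what such a .py file could look like for the Simple App from the question (the file name simple_app.py, the contains helper, and passing the input path as a command-line argument are my own assumptions, not something from the docs):

```python
# simple_app.py: hypothetical file name for the question's Simple App
import sys


def contains(letter):
    """Return a predicate that is True for lines containing `letter`."""
    return lambda line: letter in line


def main(log_file):
    # Imported here so the helper above can be used without Spark installed.
    from pyspark import SparkContext

    # No master is hard-coded; spark-submit's --master option supplies it.
    sc = SparkContext(appName="Simple App")
    log_data = sc.textFile(log_file).cache()
    num_as = log_data.filter(contains("a")).count()
    num_bs = log_data.filter(contains("b")).count()
    # Parenthesized print works in both Python 2 and Python 3.
    print("Lines with a: %i, lines with b: %i" % (num_as, num_bs))
    sc.stop()


# Assumption: the input path is the first command-line argument.
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1])
```

Saved under that name, it should be submittable the same way as pi.py above, with the path to a real file (for example your Spark README.md) in place of the trailing 1000. As for the NameError: it means the line assigning numAs never completed, most likely because that statement or an earlier one raised an error. That is only a guess, since the full traceback is not shown.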
Read more here:
Try these examples; they are really helpful:
Upvotes: 1