Reputation: 39
I am quite new to Python. I have Anaconda3 – 4.4.0 installed with a PySpark kernel (Spark 2.2.0).
I am trying to run a simple script against a small text file on my Windows 7 machine to make sure the basic capabilities of my Python installation work.
Here is my script:
word_counts = ('C:\\Users\\oakins1p\\WeeklyMeeting.txt') \
.flatMap(lambda line: line.split()) \
.map(lambda word: (word, 1)) \
.reduceByKey(lambda a, b: a + b)\
.saveAsTextFile('C:\\Users\\oakins1p\\WeeklyMeetingOutput.txt')
I keep getting an AttributeError: 'str' object has no attribute 'flatMap', and I am not sure how to resolve it.
Upvotes: 1
Views: 2243
Reputation: 73366
word_counts is built from a plain string (the file path), and one doesn't simply call flatMap() on a string; that method belongs to a Spark RDD.
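To see where the error comes from, here is a minimal sketch in plain Python, with no Spark involved. The path below is a hypothetical placeholder, not the file from the question.

```python
# A parenthesised string literal is still just a str, not a Spark RDD.
path = ('C:\\some\\hypothetical\\file.txt')  # placeholder path, assumed for illustration

try:
    # str has no flatMap method, so this raises AttributeError
    path.flatMap(lambda line: line.split())
except AttributeError as exc:
    message = str(exc)

print(type(path).__name__)  # str
print(message)              # 'str' object has no attribute 'flatMap'
```

The file is never opened here; the call fails on the string itself, which is why the fix is to hand the path to Spark first.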
Try reading the file with textFile() first, like this:
from pyspark import SparkContext
sc = SparkContext.getOrCreate()
word_counts = sc.textFile(filepath).flatMap()...
inspired by this example.
Upvotes: 4
Reputation: 191728
You forgot to read the file. Try using the textFile() method of the SparkContext.
Upvotes: 1