Rapcetera

Reputation: 39

'str' object has no attribute 'flatMap'

I am quite new to Python. I have Anaconda3 – 4.4.0 installed with a PySpark kernel (Spark 2.2.0).

I am trying to run a simple script against a small text file on Windows 7 to confirm that my Python installation works.

Here is my script:

word_counts = ('C:\\Users\\oakins1p\\WeeklyMeeting.txt') \
    .flatMap(lambda line: line.split()) \
    .map(lambda word: (word, 1)) \
    .reduceByKey(lambda a, b: a + b) \
    .saveAsTextFile('C:\\Users\\oakins1p\\WeeklyMeetingOutput.txt')

I keep getting an AttributeError: 'str' object has no attribute 'flatMap' and I am not sure how to resolve it.

Upvotes: 1

Views: 2243

Answers (2)

gsamaras

Reputation: 73366

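The expression ('C:\\Users\\oakins1p\\WeeklyMeeting.txt') is just a Python string wrapped in parentheses, and a string has no flatMap() method. That method belongs to Spark RDDs.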
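To see where the error comes from, here is a small sketch that needs no Spark at all. The variable names `path` and `message` are only for illustration; the path is the one from the question and the file does not need to exist for this to run.

```python
# Parentheses around a single string only group the expression;
# the result is still a plain str, not a Spark RDD.
path = ('C:\\Users\\oakins1p\\WeeklyMeeting.txt')

print(type(path).__name__)        # the type of the parenthesized expression
print(hasattr(path, 'flatMap'))   # whether str offers a flatMap attribute

try:
    path.flatMap(lambda line: line.split())
except AttributeError as exc:
    message = str(exc)
    print(message)                # the same error text as in the question
```

Nothing in the original script ever hands the path to Spark, so the chained calls are looked up on the string itself.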

Read the file with textFile() first, like this:

from pyspark import SparkContext
sc = SparkContext.getOrCreate()  # reuse the active context, or create one
# filepath is the path to your input text file
word_counts = sc.textFile(filepath).flatMap()...

inspired by this example.
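Putting that together with the chain from the question, a minimal sketch could look like the following. The function names `split_line`, `to_pair` and `run_word_count` are hypothetical, made up for this example, and the lambdas from the question are replaced with named functions only so they are easier to check on their own. The import sits inside the function on the assumption that you may want to try the helpers without a Spark installation.

```python
from operator import add


def split_line(line):
    # Split one line of text into whitespace-separated words.
    return line.split()


def to_pair(word):
    # Pair each word with an initial count of 1.
    return (word, 1)


def run_word_count(input_path, output_path):
    # Imported here so the helpers above can be used without Spark installed.
    from pyspark import SparkContext

    sc = SparkContext.getOrCreate()
    counts = (
        sc.textFile(input_path)   # RDD of lines, not a plain str
        .flatMap(split_line)      # RDD of individual words
        .map(to_pair)             # RDD of (word, 1) pairs
        .reduceByKey(add)         # RDD of (word, total) pairs
    )
    # saveAsTextFile returns None, so it is called as a separate statement
    # instead of being assigned to a variable.
    counts.saveAsTextFile(output_path)
    return counts
```

You would call it as run_word_count('C:\\Users\\oakins1p\\WeeklyMeeting.txt', 'C:\\Users\\oakins1p\\WeeklyMeetingOutput.txt'). Note that saveAsTextFile() writes a directory of part files at the given path rather than a single file, and it fails if that path already exists.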

Upvotes: 4

OneCricketeer

Reputation: 191728

You forgot to read the file. Try using the textFile() method of the SparkContext.

Upvotes: 1
