Reputation: 117
I'm running a PySpark job on Spark (single node, standalone) and trying to save the output to a text file on the local file system.
input = sc.textFile(inputfilepath)
words = input.flatMap(lambda x: x.split())
wordCount = words.countByValue()
wordCount.saveAsTextFile("file:///home/username/output.txt")
I get an error saying
AttributeError: 'collections.defaultdict' object has no attribute 'saveAsTextFile'
Whatever method I call on the 'wordCount' object, for example collect() or map(), I get the same error. The code works without problems when the output goes to the terminal (with a for loop), but I can't figure out what is missing to send the output to a file.
Upvotes: 0
Views: 1617
Reputation: 9768
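The countByValue() method that you're calling is an action: it returns a dictionary of word counts to the driver. This is a plain Python dictionary (a collections.defaultdict, as the error message shows), not an RDD, so it doesn't have Spark methods such as saveAsTextFile(), collect() or map() available to it.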
You can use your favorite method to save the dictionary locally.
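As one possible way to do that, here is a minimal sketch that writes the dictionary with ordinary Python file I/O. The helper name save_counts, the tab-separated layout, and the temporary output path are assumptions made for illustration, and a small hand-built defaultdict stands in for the result of countByValue() so the snippet runs without Spark. Note that open() expects a plain local path, not a file:// URI.

```python
import os
import tempfile
from collections import defaultdict


def save_counts(counts, path):
    """Write one 'word<TAB>count' line per dictionary entry (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as f:
        # Sort by word so the output order is deterministic.
        for word, count in sorted(counts.items()):
            f.write(f"{word}\t{count}\n")


# Stand-in for wordCount = words.countByValue(), so the sketch runs without Spark.
word_count = defaultdict(int)
for word in "to be or not to be".split():
    word_count[word] += 1

# Assumed output location for this sketch; substitute your own local path.
output_path = os.path.join(tempfile.gettempdir(), "wordcount_output.txt")
save_counts(word_count, output_path)
```

If you would rather keep the data in Spark and use saveAsTextFile(), the counts have to stay in an RDD, for example by replacing countByValue() with words.map(lambda w: (w, 1)).reduceByKey(lambda a, b: a + b). Be aware that saveAsTextFile() writes a directory of part files rather than a single text file.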
Upvotes: 1