Reputation: 2393
In PySpark, when I print a list, all the elements are printed on the same line:
>>> wordslist = words.collect();
>>> wordslist
[(u'crazy', 1), (u'fox', 1), (u'jumped', 1)]
Is there any way to get the output printed one item per line, like this:
>>> wordslist
[
(u'crazy', 1),
(u'fox', 1),
(u'jumped', 1)
]
Upvotes: 0
Views: 15145
Reputation: 40370
This is basic Python. When you collect a result from an RDD, you obtain a list that you can iterate over, printing each element in whatever format you wish.
I think the question of how to print a list has been answered many times on SO.
Here is one example:
>>> mylist = myrdd.collect()
>>> for elem in mylist:
...     print(elem)
You may also want to check the PySpark documentation.
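If you want output closer to the bracketed layout in the question, the standard-library `pprint` module is one option. The sketch below is written for Python 3 and uses a hard-coded list as a stand-in for `words.collect()`, so the data and the `width=20` setting are assumptions for illustration, not something taken from your job. `pformat` only wraps when the list's repr is longer than `width`, so a short list needs a smaller width than the default of 80.

```python
from pprint import pformat

# Hypothetical stand-in for words.collect(); no Spark session is needed to try this.
wordslist = [("crazy", 1), ("fox", 1), ("jumped", 1)]

# Option 1: build one line per element by hand.
lines = [str(elem) for elem in wordslist]
print("\n".join(lines))

# Option 2: let pprint wrap the list. width=20 is an assumed value, chosen so that
# each tuple fits on a line but the whole list does not.
formatted = pformat(wordslist, width=20)
print(formatted)
```

Keep in mind that `collect()` brings the whole RDD back to the driver, so for large data it is usually safer to print a sample, for example the result of `take(n)`.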
Upvotes: 1