Reputation: 109
I have a list of lists of tokens, such as:
mylist = [['hello'],
          ['cat'],
          ['dog'],
          ['hey'],
          ['dog'],
          ['I', 'need', 'coffee'],
          ['dance'],
          ['dream', 'job']]
myRDD = sc.parallelize(mylist)
I'm struggling to find the operation that will produce an RDD where each row holds exactly one token. My desired output is:
[['hello'],
 ['cat'],
 ['dog'],
 ['hey'],
 ['dog'],
 ['I'],
 ['need'],
 ['coffee'],
 ['dance'],
 ['dream'],
 ['job']]
What's the right syntax for this? Thank you.
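For reference, one operation that appears to fit this description is `flatMap`, which applies a function to every row and flattens the results by one level. The sketch below is only an illustration: `split_tokens` is a hypothetical helper name, and the runnable part emulates the flattening locally with `itertools.chain` so it works without a cluster. The commented lines at the end assume `sc` is an active SparkContext, as in the snippet above.

```python
from itertools import chain

mylist = [['hello'], ['cat'], ['dog'], ['hey'], ['dog'],
          ['I', 'need', 'coffee'], ['dance'], ['dream', 'job']]


def split_tokens(row):
    """Turn one row of tokens into a list of single-token rows (hypothetical helper)."""
    return [[token] for token in row]


# Local emulation of flatMap: apply the function to each row,
# then flatten the per-row results by one level.
flattened = list(chain.from_iterable(split_tokens(row) for row in mylist))
print(flattened)

# On the RDD from the question (assumes a running SparkContext named `sc`):
# myRDD = sc.parallelize(mylist)
# result = myRDD.flatMap(split_tokens).collect()
```

By contrast, a plain `map` with the same function would keep one output row per input row, so `['I', 'need', 'coffee']` would become a single nested row rather than three separate ones.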
Upvotes: 0
Views: 270