Reputation: 245
I have a tuple like.. (a, list(b,c,d))
. I want the output like
(a,b)
(a,c)
(a,d)
I am trying to use flatMap for this but not getting any success. Even map is not helping in this case.
Input Data :
Chap01:Spark is an emerging technology
Chap01:You can easily learn Spark
Chap02:Hadoop is a Bigdata technology
Chap02:You can easily learn Spark and Hadoop
Code:
val rawData = sc.textFile("C:\\wc_input.txt")
val chapters = rawData.map(line => (line.split(":")(0), line.split(":")(1)))
val chapWords = chapters.flatMap(a => (a._1, a._2.split(" ")))
Upvotes: 1
Views: 874
Reputation: 245
This scenario can be easily handled by flatMapValues methods in RDD. It works only on values of a pair RDD keeping the key same.
Upvotes: 0
Reputation: 3191
You could map over the second element of the tuple:
val t = ('a', List('b','c','d'))
val res = t._2.map((t._1, _))
The snipped above resolves to:
res: List[(Char, Char)] = List((a,b), (a,c), (a,d))
Upvotes: 2