Reputation: 2040
I have a PairRDD of type RDD[(String, Array[String])]. I want to flatten the values so that I have an RDD[(String, String)] where each element of the Array[String] in the first RDD becomes a dedicated element in the second RDD.
For instance, my first RDD has the following elements:
("a", Array("x", "y"))
("b", Array("y", "z"))
The result I want is this:
("a", "x")
("a", "y")
("b", "y")
("b", "z")
How can I do this? flatMapValues(f: Array[String] => TraversableOnce[String]) seems to be the right choice here, but what do I need to pass as the argument f?
Upvotes: 0
Views: 1919
Reputation: 2040
To achieve the desired result, do:
val rdd1: RDD[(Any, Array[Any])] = ...
val rddFlat: RDD[(Any, Any)] = rdd1.flatMapValues(identity[Array[Any]])
The result looks like the one asked for in the question.
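For the concrete types from the question, a minimal sketch might look like the following. It assumes a local SparkSession, and the names FlattenValues and flatten are hypothetical, chosen only for illustration. It passes _.toSeq instead of identity so that the Array-to-sequence conversion is explicit rather than left to Scala's implicit array wrapping.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

// Hypothetical wrapper object, used only to make the sketch self-contained.
object FlattenValues {

  // Emits one (key, element) pair for every element of each value array.
  def flatten(rdd: RDD[(String, Array[String])]): RDD[(String, String)] =
    rdd.flatMapValues(_.toSeq)

  def main(args: Array[String]): Unit = {
    // Assumption: a local session is enough for trying this out.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("flatten-values")
      .getOrCreate()

    // The sample data from the question.
    val rdd1: RDD[(String, Array[String])] = spark.sparkContext.parallelize(Seq(
      ("a", Array("x", "y")),
      ("b", Array("y", "z"))
    ))

    // Print the flattened (key, value) pairs.
    flatten(rdd1).collect().foreach(println)

    spark.stop()
  }
}
```

With the sample data, this should yield the four pairs listed in the question. flatMapValues also leaves the keys untouched, so an existing partitioner on the source RDD is preserved.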
Upvotes: 4