Reputation: 21
How can I traverse the following RDD using Spark and Scala? I want to print every value present in the Seq together with its associated key.
res1: org.apache.spark.rdd.RDD[(java.lang.String, Seq[java.lang.String])] = MapPartitionsRDD[6] at groupByKey at <console>:14
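For a runnable reference, here is a minimal sketch of how an RDD of that type could be built in the spark-shell; the sample data and the name mapfile are assumptions, not taken from the original post.

// hypothetical sample data, only so the snippets below can be run
val pairs = sc.parallelize(Seq(("a", "2"), ("a", "1"), ("b", "3")))
// groupByKey yields RDD[(String, Iterable[String])]; mapValues(_.toSeq) gives the Seq type shown above
val mapfile = pairs.groupByKey().mapValues(_.toSeq)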
I tried the following code for it, but it does not print the (key, value) pairs: the while loop returns Unit, so ss ends up as an RDD[Unit].
val ss = mapfile.map(x => {
  val key = x._1
  val value = x._2.sorted
  var i = 0
  while (i < value.length) {
    (key, value(i))
    i += 1
  }
})
ss.top(20).foreach(println)
Upvotes: 2
Views: 1266
Reputation: 1
I tried this, and it works for the return type mentioned in the question.
val ss = mapfile
  .flatMap { case (key, value) => value.sorted.map((key, _)) }
  .groupByKey()
  .map { case (key, values) => (key, values.toSeq) }
// take is used here instead of top, since top would need an implicit Ordering for Seq[String]
ss.take(20).foreach(println)
Note: ss is of type org.apache.spark.rdd.RDD[(java.lang.String, Seq[java.lang.String])].
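For what it's worth, this flattens the pairs out and then groups them straight back, so ss ends up with the same type and contents as mapfile (with the values sorted); if the goal is only to print each (key, value) pair, the flatMap step on its own is enough, as the answer below shows.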
Upvotes: 0
Reputation: 20836
I tried converting your code as follows:
val ss = mapfile.flatMap {
  case (key, value) => value.sorted.map((key, _))
}
ss.top(20).foreach(println)
Is it what you want?
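For illustration, with the hypothetical mapfile sketched under the question, this prints the pairs in descending tuple order:

(b,3)
(a,2)
(a,1)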
Upvotes: 3