Reputation: 1527
I created 2 Pair RDD's in Spark
var pairrdd = sc.parallelize(List((1,2),(3,4),(3,6)))
var pairrdd2 = sc.parallelize(List((3,9)))
I applied the cogroup function
var cogrouped = pairrdd.cogroup(pairrdd2)
The object type for cogroupedrdd looks like below.
cogrouped: org.apache.spark.rdd.RDD[(Int, (Iterable[Int], Iterable[Int]))] = MapPartitionsRDD[801] at cogroup at <console>:60
I am trying to create a function to iterate over these values
def iterateThis((x: Int,(x1:Iterable[Int],x2:Iterable[Int])))={
println(x1.mkString(","))
}
but am I getting below error.
<console>:21: error: identifier expected but '(' found.
def iterateThis((x: Int,(x1:Iterable[Int],x2:Iterable[Int])))={
^
Upvotes: 0
Views: 408
Reputation: 53839
Your argument is of type (Int, (Iterable[Int], Iterable[Int]))
:
def iterateThis(arg: (Int, (Iterable[Int], Iterable[Int]))) = {
val (_, (x1, _)) = arg
println(x1.mkString(","))
}
Upvotes: 1