Reputation: 108
I am a beginner in Apache Spark and Scala. I'm getting an error in while executing below code:
val z = sc.parallelize(List(1,2,3,4,5,6),2)
//print the content with partition labels
def myfunc(index: Int, iter: Iterator[(int)] ): Iterator[String] = {
iter.toList.map( x => "[Part Id: " + index + " ,val:" + x + "]").iterator
}
z.mapPartitionsWithIndex(myfunc).collect
error : value mapPartitionsWithIndex is not a member of org.apache.spark.rdd.RDD[Int]
I coudn't understand what is wrong in the code? Can any one explain? Thanks in advance.
Upvotes: 0
Views: 272
Reputation: 459
It should be Iterator[Int]
not Iterator[(Int)]
.
val z = sc.parallelize(List(1, 2, 3, 4, 5))
def func(index: Int, iter: Iterator[Int]): Iterator[String] = {
iter.map(x => s"[Part ID: ${index}, val: ${x}]")
}
z.mapPartitionsWithIndex(func).collect()
Or just as follows:
z.mapPartitionsWithIndex{ case (index, iter) =>
iter.map(x => s"[Part ID: ${index}, val: ${x}]")
}.collect()
Upvotes: 1