M_Gandhi
M_Gandhi

Reputation: 108

Scala: Getting error - mapPartitionsWithIndex is not a member of org.apache.spark.rdd.RDD[Int]

I am a beginner in Apache Spark and Scala. I'm getting an error in while executing below code:

val z = sc.parallelize(List(1,2,3,4,5,6),2)

//print the content with partition labels
def myfunc(index: Int, iter: Iterator[(int)] ): Iterator[String] = {
    iter.toList.map( x => "[Part Id: " + index + " ,val:" + x + "]").iterator
}

z.mapPartitionsWithIndex(myfunc).collect
error : value mapPartitionsWithIndex is not a member of org.apache.spark.rdd.RDD[Int]

I coudn't understand what is wrong in the code? Can any one explain? Thanks in advance.

Upvotes: 0

Views: 272

Answers (1)

lxy
lxy

Reputation: 459

It should be Iterator[Int] not Iterator[(Int)].

val z = sc.parallelize(List(1, 2, 3, 4, 5))

def func(index: Int, iter: Iterator[Int]): Iterator[String] = {
  iter.map(x => s"[Part ID: ${index}, val: ${x}]")
}

z.mapPartitionsWithIndex(func).collect()

Or just as follows:

z.mapPartitionsWithIndex{ case (index, iter) =>
    iter.map(x => s"[Part ID: ${index}, val: ${x}]")
}.collect()

Upvotes: 1

Related Questions