diens

Reputation: 659

Spark: function to print RDD[A]

I am writing a function that receives an RDD and an integer n, and prints the first n elements of that RDD.

The RDD parameter doesn't have a predetermined type, and I wanted to use pattern matching to print it differently depending on its element type.

For instance, given myRDD: RDD[(String, Array[String])], when I call printRddContent(myRDD, n) I would like to print it this way (outside a function, this works well):

anRdd.map { case (a, arr) => (a, arr.toList) }.collect().take(n).foreach(println)

And so on, with different patterns.

So far, this is my code:

  def printRddContent[A](anRdd: RDD[A], n: Int) = {  
    anRdd match {
      case r1: RDD[(String, Array[String])] => anRdd.map { case (a, arr) => (a, arr.toList) }.take(n).foreach(println)
      case _ => "case clause"
    }
  }

But .toList is flagged with the message: Cannot resolve symbol toList. I don't understand why this doesn't work inside the function.

Upvotes: 1

Views: 134

Answers (1)

mahmoud mehdi

Reputation: 1590

Here's a solution based on the code you provided:

  def printRddContent[A](anRdd: RDD[A], n: Int) = {
    anRdd match {
      case r1: RDD[(String, Array[String])] => r1.asInstanceOf[RDD[(String, Array[String])]].map { case (a, arr) => (a, arr.toList)}.take(n).foreach(println)
      case _ => "case clause"
    }
  }

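The difference from your version is that the map is called on r1, which is typed as RDD[(String, Array[String])], rather than on anRdd, which is still an RDD[A]; with an unknown A the compiler cannot tell that arr is an array, so toList cannot be resolved. Be aware that, because of type erasure, the pattern match only checks at runtime that the value is an RDD, not what its element type is (the compiler reports an unchecked warning for it). The asInstanceOf is therefore safe only as long as the caller really passes an RDD[(String, Array[String])]; any other RDD would also enter this branch and fail when its elements are processed.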
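
If you want to avoid relying on the erased type parameter, one possible alternative is to match on each element instead of on the RDD itself, since the runtime class of an element (a tuple, an array) can actually be checked. Below is a minimal sketch of that idea. The names PrintSketch and render are hypothetical, and the sample uses a plain Seq so that it runs without a Spark context; with an RDD the call would look like the one in the final comment.

```scala
object PrintSketch {
  // Hypothetical helper: formats a single element according to its runtime shape.
  def render(elem: Any): String = elem match {
    // A pair whose second component is an array: show the array as a List.
    case (key, values: Array[_]) => (key, values.toList).toString
    // Anything else: fall back to the default string form.
    case other => other.toString
  }

  def main(args: Array[String]): Unit = {
    // Stand-in for the elements returned by anRdd.take(n).
    val sample: Seq[Any] = Seq(("k", Array("a", "b")), 42)
    sample.map(render).foreach(println)
    // Assumed usage with Spark: anRdd.take(n).map(render).foreach(println)
  }
}
```

Calling take(n) first also means only n elements are brought to the driver before formatting, and the array check happens on real values rather than on a type argument the JVM has already erased.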

Upvotes: 2
