Nick

Reputation: 2938

Why does mapPartitions print nothing to stdout?

I have this code in Scala:

import org.apache.spark.{SparkConf, SparkContext}

object SimpleApp {

  // Prints every element of the partition, then returns the iterator.
  def myf(x: Iterator[(String, Int)]): Iterator[(String, Int)] = {
    while (x.hasNext) {
      println(x.next())
    }
    x
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)
    val tx1 = sc.textFile("/home/paourissi/Desktop/MyProject/data/testfile1.txt")
    // Split each line into words and pair every word with a count of 1.
    val file1 = tx1.flatMap(line => line.split(" ")).map(word => (word, 1))
    val s = file1.mapPartitions(x => myf(x))
  }
}

I am trying to figure out why it doesn't print anything to stdout. I am running this on a local machine, not on a cluster.

Upvotes: 3

Views: 1950

Answers (2)

lev

Reputation: 4127

mapPartitions is a transformation, and is therefore lazy: nothing runs until an action is called on the result.

If you add an action at the end, the whole expression will be evaluated. Try adding s.count at the end.
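As a minimal sketch of that suggestion, here is the question's program with s.count() added as the action. It assumes, as in the original, that the master is supplied externally (for example by spark-submit). One change beyond the action is my own and not part of the answer above: myf prints through a lazy map instead of a while loop, so the elements are still available to count afterwards.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object SimpleApp {

  // Print each element as it flows through and pass it on unchanged.
  // (Assumption: this replaces the while loop from the question, which
  // would leave the returned iterator empty.)
  def myf(x: Iterator[(String, Int)]): Iterator[(String, Int)] =
    x.map { pair =>
      println(pair)
      pair
    }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)
    val tx1 = sc.textFile("/home/paourissi/Desktop/MyProject/data/testfile1.txt")
    val file1 = tx1.flatMap(line => line.split(" ")).map(word => (word, 1))

    // Still lazy: this only records the transformation.
    val s = file1.mapPartitions(x => myf(x))

    // count() is an action, so Spark now runs the job and myf is invoked
    // once per partition.
    println(s"Number of (word, 1) pairs: ${s.count()}")

    sc.stop()
  }
}
```

In local mode the println output from myf should appear in the same console as the driver output; on a cluster it would go to the executors' stdout logs instead.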

Upvotes: 4

Greg

Reputation: 589

You only have transformations, no actions. Spark will not execute anything until an action is called. Add this line to print the first 10 elements of your result:

s.take(10).foreach(println)
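One caveat, offered as a side note rather than as part of the answer above: the myf in the question drains its iterator in the while loop and then returns that same, now exhausted, iterator. The action should still trigger the println calls inside myf, but s itself would contain no elements, so s.take(10) would have nothing to print. The plain-Scala sketch below needs no Spark and shows the difference; drain and passThrough are hypothetical names used only for illustration.

```scala
object IteratorDemo {

  // Same shape as the question's myf: consumes everything, then returns
  // the iterator, which by that point has no elements left.
  def drain(x: Iterator[(String, Int)]): Iterator[(String, Int)] = {
    while (x.hasNext) {
      println(x.next())
    }
    x
  }

  // Prints each element lazily and hands it on, so downstream consumers
  // (such as take or count in Spark) still see the data.
  def passThrough(x: Iterator[(String, Int)]): Iterator[(String, Int)] =
    x.map { pair =>
      println(pair)
      pair
    }

  def main(args: Array[String]): Unit = {
    val data = List(("a", 1), ("b", 1))

    val drained = drain(data.iterator).toList
    val kept = passThrough(data.iterator).toList

    println(s"drained is empty: ${drained.isEmpty}")
    println(s"kept equals input: ${kept == data}")
  }
}
```

If the goal is to both print and keep the elements, returning a mapped iterator as in passThrough is one way to do it inside mapPartitions.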

Upvotes: 6
