Reputation: 61
I wrote a sample Spark/Scala program that builds a list of JSON elements from a DataFrame. When I run it with a main method it returns an empty list, but when I run it as an object that extends App it returns a list containing the records. What is the difference between extends App and a main method in a Scala object?

Here is the version with a main method:
import java.util

import org.apache.spark.sql.SparkSession

object DfToMap {
  def main(args: Array[String]): Unit = {
    val spark: SparkSession = SparkSession.builder()
      .appName("Rnd")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(
      (8, "bat"),
      (64, "mouse"),
      (27, "horse")
    ).toDF("number", "word")

    val json = df.toJSON
    val jsonArray = new util.ArrayList[String]()
    json.foreach(f => jsonArray.add(f))
    print(jsonArray)
  }
}
It returns an empty list. But the following program gives me a list with the records:
import java.util

import org.apache.spark.sql.SparkSession

object DfToMap extends App {
  val spark: SparkSession = SparkSession.builder()
    .appName("Rnd")
    .master("local[*]")
    .getOrCreate()
  import spark.implicits._

  val df = Seq(
    (8, "bat"),
    (64, "mouse"),
    (27, "horse")
  ).toDF("number", "word")

  val json = df.toJSON
  val jsonArray = new util.ArrayList[String]()
  json.foreach(f => jsonArray.add(f))
  print(jsonArray)
}
Upvotes: 5
Views: 2177
Reputation: 66
TL;DR Neither snippet is a correct Spark program; one is just more incorrect than the other.
You've made two mistakes, both explained in the introductory Spark materials.
Due to its nature, Spark doesn't support applications extending App. From Quick Start - Self-Contained Applications:
Note that applications should define a main() method instead of extending scala.App. Subclasses of scala.App may not work correctly.
Spark doesn't provide global shared memory, so mutating a global object inside a closure is not supported; see the Spark Programming Guide - Understanding Closures. The function passed to foreach runs on the executors against a serialized copy of jsonArray, so the driver's list is never updated. A sketch of the supported approach is shown below.
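For reference, here is a minimal sketch that addresses both points at once, assuming the goal is simply to get the JSON strings into a java.util.List on the driver: define a main() method instead of extending App, and use collectAsList() to bring the rows back to the driver rather than mutating a local variable inside foreach.

import java.util

import org.apache.spark.sql.SparkSession

object DfToMap {
  def main(args: Array[String]): Unit = {
    val spark: SparkSession = SparkSession.builder()
      .appName("Rnd")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(
      (8, "bat"),
      (64, "mouse"),
      (27, "horse")
    ).toDF("number", "word")

    // collectAsList() gathers the JSON strings on the driver,
    // so no shared mutable state is touched inside a closure
    val jsonArray: util.List[String] = df.toJSON.collectAsList()
    print(jsonArray)
  }
}

If you actually need a distributed, write-only aggregation inside a transformation, Spark's accumulators are the supported mechanism for that pattern.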
Upvotes: 5