Tiffany

Reputation: 273

Spark history server

I'm currently working with Spark in Scala in the IntelliJ IDE. I have access to the web UI while my program is running, but I don't have access to the history server afterwards.

Any help would be much appreciated; I'm on Windows, by the way. I have also created a log folder at C:/tmp/spark-events and edited spark-defaults.conf as follows:

# Example:
# spark.master                     spark://master:7077
# spark.eventLog.enabled           true
# spark.eventLog.dir               hdfs://namenode:8021/directory
# spark.serializer                 org.apache.spark.serializer.KryoSerializer
# spark.driver.memory              5g
# spark.executor.extraJavaOptions  -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
-Dspark.eventLog.enabled           true
-Dspark.history.fs.logDirectory    file:///C:/tmp/spark-events
-Dspark.eventLog.dir               file:///C:/tmp/spark-events
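(For reference, spark-defaults.conf normally takes bare property names with whitespace-separated values, without the -D prefix; -D is the JVM system-property syntax used on the command line. The same three settings in the usual file format would look something like:

```
spark.eventLog.enabled           true
spark.eventLog.dir               file:///C:/tmp/spark-events
spark.history.fs.logDirectory    file:///C:/tmp/spark-events
```

)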

but I don't see any logs after execution.

UPDATE 1: After following this tutorial https://medium.com/@eyaldahari/how-to-run-spark-history-server-on-windows-52cde350de07 I now have access to the history server. However, it is empty and the logs don't exist...

UPDATE 2: I am very close now. If I launch spark-shell in the console, logging and the history server both work, but when I run this program from my IDE it doesn't write any logs.

Here's my code:

import org.apache.spark.{SparkConf, SparkContext}

object SimpleScalaSpark {
  def main(args: Array[String]) {
    val logFile = "/Users/me/README.md" // Should be some file on your system
    val conf = new SparkConf().setAppName("Simple Application").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val logData = sc.textFile(logFile, 2).cache()

    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
  }
}

Upvotes: 2

Views: 1792

Answers (1)

Tiffany

Reputation: 273

I finally solved the issue by adding the following lines to the code:

conf.set("spark.eventLog.enabled", "true")
conf.set("spark.eventLog.dir", "file:///C:/Users/me/spark/logs")

It's all working perfectly now.

Upvotes: 4
