blue-sky

Reputation: 53826

Apache Spark output messages are prepended with [error]

This word count works as expected:

System.setProperty("hadoop.home.dir", "H:\\winutils");

val sparkConf = new SparkConf().setAppName("GroupBy Test").setMaster("local[1]")
val sc = new SparkContext(sparkConf)

def main(args: Array[String]) {
  val text_file = sc.textFile("h:\\data\\small.txt")

  val counts = text_file.flatMap(line => line.split(" "))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

  counts.foreach(println);
}

All output messages are prepended with [error], for example:

[error] 16/03/17 12:13:58 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on por
[error] 16/03/17 12:13:58 INFO NettyBlockTransferService: Server created on 55715
[error] 16/03/17 12:13:58 INFO BlockManagerMaster: Trying to register BlockManager
[error] 16/03/17 12:13:58 INFO BlockManagerMasterEndpoint: Registering block manager localhost:55715 with 1140.4 MB RAM, BlockManage
[error] 16/03/17 12:13:58 INFO BlockManagerMaster: Registered BlockManager

I can prevent these messages from being displayed using:

import org.apache.log4j.Logger
import org.apache.log4j.Level

Logger.getLogger("org").setLevel(Level.OFF)
Logger.getLogger("akka").setLevel(Level.OFF)

But this does not fix the issue.

These messages should not be prefixed with [error], as they are not errors but INFO messages (see the output above).

Update:

Why are these messages displayed with an [error] prefix when they are not errors?

Upvotes: 1

Views: 346

Answers (1)

Mateusz Dymczyk

Reputation: 15141

Those are not Spark labels but sbt ones. In the default log4j config file of Spark you can find:

log4j.appender.console.target=System.err

So by default it will print to stderr in the console.
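
One way around that is to point the console appender at stdout instead, for example with a custom log4j.properties on the application classpath. A minimal sketch, reusing the key names from Spark's bundled log4j template (adjust the root level to taste):

# Sketch of a log4j.properties that logs to stdout instead of stderr,
# so the build tool no longer treats Spark's log output as error output.
log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.out
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n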

You are probably setting fork to true in your run configuration somewhere. When you do, everything the forked process writes to stderr is logged by sbt at the error level and so gets prepended with [error].

You should be able to control it with the OutputStrategy.
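
For example, a build.sbt sketch for an sbt 0.13-era build that keeps forking but sends the forked process's output straight to sbt's stdout instead of logging it (the keys below are standard sbt settings; adjust the scoping to your project):

// build.sbt (sketch): keep the forked JVM but stop sbt from
// logging its stderr with the [error] prefix.
fork in run := true
outputStrategy := Some(StdoutOutput)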

Upvotes: 4
