Dikshit Rajkhowa
Dikshit Rajkhowa

Reputation: 111

Unable to log from Spark Streaming

I am trying to log output of spark streaming as shown in the code below

dStream.foreachRDD { rdd =>

  if (rdd.count() > 0) {
    @transient lazy val log = Logger.getLogger(getClass.getName)  
    log.info("Process Starting")

      rdd.foreach { item =>

     log.info("Output :: "+item._1 + "," + item._2 + "," + System.currentTimeMillis())
    }
 }

The code is executed on a yarn cluster using the following command

./bin/spark-submit --class "StreamingApp" --files file:/home/user/log4j.properties  --conf  "spark.driver.extraJavaOptions=-Dlog4j.configuration=file:/home/user/log4j.properties" --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:/home/user/log4j.properties" --master yarn --deploy-mode cluster --driver-memory 4g --executor-memory 2g --executor-cores 1 /home/user/Abc.jar

When i view the logs from the yarn cluster, I can find the logs written before foreach i.e. log.info("Process Starting") but the logs inside foreach are not printing.

I had also tried creating a separate serializable class as below

    object LoggerObj extends Serializable{

        @transient lazy val log = Logger.getLogger(getClass.getName)
   }

and using the same inside foreach as follows

dStream.foreachRDD { rdd =>

  if (rdd.count() > 0) {

    LoggerObj.log.info("Process Starting")

      rdd.foreach { item =>

     LoggerObj.log.info("Output :: "+item._1 + "," + item._2 + "," + System.currentTimeMillis())
    }
 }

but still the same issue, only the logs outside of foreach are being printed.

The log4j.properties is given below

log4j.rootLogger=INFO,stdout,FILE
log4j.rootCategory=INFO,FILE
log4j.appender.file=org.apache.log4j.FileAppender
log4j.appender.FILE=org.apache.log4j.RollingFileAppender
log4j.appender.FILE.File=/tmp/Rt.log
log4j.appender.FILE.ImmediateFlush=true
log4j.appender.FILE.Threshold=debug
log4j.appender.FILE.Append=true
log4j.appender.FILE.MaxFileSize=500MB
log4j.appender.FILE.MaxBackupIndex=10
log4j.appender.FILE.layout=org.apache.log4j.PatternLayout
log4j.appender.FILE.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
log4j.logger.Holder=INFO,FILE

Upvotes: 5

Views: 2070

Answers (1)

Dikshit Rajkhowa
Dikshit Rajkhowa

Reputation: 111

I was able to fix it by putting "log4j.properties" file under each worker nodes.

Upvotes: 1

Related Questions