Sourav Ghosh

Reputation: 33

Unable to enable INFO logging for pyspark job

I need to enable INFO logging to capture detailed information, but I am only able to capture ERROR and WARN messages. Here is my log4j.properties:

log4j.rootCategory=INFO, console, server, file
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%p %c{1}: %m%n

# Settings to quiet third party logs that are too verbose
log4j.logger.org.eclipse.jetty=WARN
log4j.logger.org.eclipse.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO

log4j.appender.server=org.apache.log4j.net.SocketAppender
log4j.appender.server.Port=4712
log4j.appender.server.RemoteHost=
log4j.appender.server.ReconnectionDelay=10000

#log4j.rootLogger=DEBUG, file
log4j.appender.file=org.apache.log4j.RollingFileAppender
log4j.appender.file.File=/data/sourav/logs/ServiceReminder.log
log4j.appender.file.MaxFileSize=10MB
log4j.appender.file.MaxBackupIndex=10
log4j.appender.file.Threshold=debug
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss.SSS} [%thread] %5p %c{7} - %m%n

log4j.logger.org.apache.spark=INFO
log4j.logger.org.eclipse.jetty=INFO
log4j.logger.com.vmeg.code=${vm.logging.level}

log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO
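
For context, a log4j.properties like this is typically applied to both the driver and the executors at submit time; a sketch of the relevant spark-submit flags, where the file paths and script name are placeholders:

spark-submit \
  --driver-java-options "-Dlog4j.configuration=file:/path/to/log4j.properties" \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:/path/to/log4j.properties" \
  my_job.py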

Please find below the code snippet:

# Access log4j through the Py4J gateway and get a logger named after this module
log4jLogger = sc._jvm.org.apache.log4j
LOGGER = log4jLogger.LogManager.getLogger(__name__)

LOGGER.info("pyspark script testing INFO")
LOGGER.warn("pyspark script testing WARN")
LOGGER.error("pyspark script testing ERROR")

Thanks in advance!

Upvotes: 1

Views: 323

Answers (2)

ottobricks

Reputation: 343

I put a simple solution in a comment. However, several different components emit logs, so you can be more granular and set a different level for each. Example:

logManager = sc._jvm.org.apache.log4j.LogManager
# Level is a Java class, so it must also be reached through the JVM gateway
Level = sc._jvm.org.apache.log4j.Level

logManager.getLogger("org.apache").setLevel(Level.INFO)
logManager.getLogger("org.apache.spark").setLevel(Level.WARN)
logManager.getLogger("org.spark-project").setLevel(Level.ERROR)

Upvotes: 0

Travis Hegner

Reputation: 2495

In the Scala API, you would have:

spark.sparkContext.setLogLevel("INFO")

somewhere in your job. Perhaps the Python API is the same.
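
For what it's worth, PySpark's SparkContext does expose the same method, so a minimal sketch, assuming sc is an active SparkContext, would be:

# Overrides the root log level from log4j.properties for this application
sc.setLogLevel("INFO")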

Upvotes: 0
