Reputation: 967
When my Spark application fails, it logs only a very generic message to the console. To see the detailed message that reveals the true error, I have to go to the Spark History Server and view the stdout logs for my executor. Does anyone know how I can get these additional details to appear in the console? I have been looking at a few links that point to the log4j properties file, but reviewing the file, I would think it is already set up correctly:
# Set everything to be logged to the console
log4j.rootCategory=WARN, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
# Settings to quiet third party logs that are too verbose
log4j.logger.org.spark-project.jetty=WARN
log4j.logger.org.spark-project.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO
log4j.logger.org.apache.parquet=ERROR
log4j.logger.parquet=ERROR
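As a side note, the console verbosity can also be raised programmatically on the SparkContext, independent of log4j.properties. A minimal sketch, assuming a plain Scala app (the app name and master are placeholders):

import org.apache.spark.{SparkConf, SparkContext}

object LogLevelDemo {
  def main(args: Array[String]): Unit = {
    // Placeholder app name and master, for illustration only.
    val conf = new SparkConf().setAppName("log-level-demo").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Override the configured threshold at runtime; valid levels include
    // ALL, DEBUG, ERROR, FATAL, INFO, OFF, TRACE, WARN.
    sc.setLogLevel("DEBUG")

    sc.parallelize(1 to 10).map(_ * 2).collect().foreach(println)
    sc.stop()
  }
}

Note this only affects the JVM it runs in, so on a cluster it changes the driver's console output, not what the executors write to their own logs.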
A few additional details:
Upvotes: 3
Views: 6818
Reputation: 499
For the log4j.properties file to work as expected, the following needs to be added to spark-submit (assuming log4j.properties is in the classpath):
--conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j.properties"
--conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.properties"
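Putting it together, a full invocation might look like the following sketch (the --files flag ships the properties file into each executor's working directory so the -D option can resolve it; the main class and jar name are placeholders):

spark-submit \
  --master yarn \
  --deploy-mode client \
  --files log4j.properties \
  --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j.properties" \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.properties" \
  --class com.example.MyApp \
  my-app.jar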
But most importantly, you need to make sure that you are running spark-submit in yarn client mode; otherwise your driver program will be launched on one of the nodes of your cluster and you will not see its logs on the console.
For checking logs when doing spark-submit in yarn cluster mode, use this (requires yarn.log-aggregation-enable=true in yarn-site.xml):
yarn logs -applicationId <applicationId>
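If log aggregation is not enabled yet, the corresponding yarn-site.xml entry is a simple boolean property (a sketch; the NodeManagers must be restarted for it to take effect):

<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>

If you don't have the application ID handy, yarn application -list -appStates ALL prints recently run applications along with their IDs.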
Upvotes: 1