Reputation: 43447
I have been using PySpark and have a problem with the logging. Logs from the Spark module are piped to STDOUT, and I have no control over that from Python.
For example, logs such as this one are being piped to STDOUT instead of STDERR:
2018-03-12 09:50:10 WARN Utils:66 - Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.debug.maxToStringFields' in SparkEnv.conf.
Spark is not installed in the environment, only Python and PySpark.
How do I:
A. Redirect all logs to STDERR
OR
B. If that is not possible, disable the logs.
Things I have tried: pyspark.SparkConf(), but nothing I configure there seems to work; and SparkEnv.conf, with the SPARK_CONF_DIR environment variable set to match, just to check whether I could at least disable the example log above, to no avail.
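For illustration, a minimal sketch of the kind of SparkConf() setup I mean (the keys shown here are only examples, not necessarily the exact ones I tried):
from pyspark import SparkConf, SparkContext

# Example only: build a SparkConf and hand it to the SparkContext.
# Nothing I set this way changed where the Spark logs end up.
conf = SparkConf() \
    .setAppName("logging-test") \
    .set("spark.debug.maxToStringFields", "200")  # the setting named in the warning above
sc = SparkContext(conf=conf)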
Upvotes: 2
Views: 2171
Reputation: 36
You can set the log level to ERROR, so only ERROR-level (and more severe) messages are shown:
sc.setLogLevel("ERROR")  # sc is a SparkContext object from the pyspark library
But if you want to disable all PySpark logs, you can do this:
sc.setLogLevel("OFF")
Check this Stack Thread
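For completeness, here is a minimal self-contained sketch (assuming PySpark 2.x+ where SparkSession is available; note that setLogLevel only takes effect once the context is running, so a few startup messages may still get through):
from pyspark.sql import SparkSession

# Create (or reuse) a session, then quiet Spark's logging.
spark = SparkSession.builder.appName("quiet-app").getOrCreate()
spark.sparkContext.setLogLevel("ERROR")  # or "OFF" to suppress everything

# From here on, Spark itself only prints ERROR-level messages (or nothing with "OFF").
spark.range(5).show()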
Upvotes: 1