Inbar Rose

Reputation: 43447

pyspark disable logging to STDOUT

I have been using PySpark and have a problem with its logging: logs from the Spark module are piped to STDOUT, and I have no control over that from Python.

For example, logs such as this one are being piped to STDOUT instead of STDERR:

2018-03-12 09:50:10 WARN Utils:66 - Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.debug.maxToStringFields' in SparkEnv.conf.

Spark itself is not installed in the environment, only Python and PySpark.

How do I:

A. Redirect all logs to STDERR

OR

B. If that is not possible, disable the logs.


Things I have tried:

  1. I have tried to use pyspark.SparkConf(), but nothing I configure there seems to have any effect (see the sketch after this list).
  2. I have tried creating a SparkEnv.conf file and setting SPARK_CONF_DIR to match, just to check whether I could at least disable the example log above, to no avail.
  3. I have looked through the documentation but found no indication of how to accomplish this.
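
For reference, a minimal sketch of the kind of SparkConf attempt from item 1 (the only setting shown is the one named in the example WARN message above, purely for illustration; it did not change where the log output goes):

from pyspark import SparkConf, SparkContext

# Pass a log-related setting through SparkConf before the context is created.
conf = SparkConf().setAppName("logging-test").set("spark.debug.maxToStringFields", "100")
sc = SparkContext(conf=conf)  # the WARN/INFO lines still end up on STDOUT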

Upvotes: 2

Views: 2171

Answers (1)

JordiSilv

Reputation: 36

You can set the log level to ERROR, so that only ERROR-level logs are shown:

sc.setLogLevel("ERROR")  # sc is a SparkContext() object from the pyspark lib

If you want to disable all PySpark logs entirely, you can use:

sc.setLogLevel("OFF")

Check this Stack Thread

Upvotes: 1
