Reputation: 3177
I am using PySpark to run some simulations with different datasets, and I'd like to save all the console output (INFOs, WARNs, etc.) to a text file on the fly, that is, by declaring inside the code the text file that will contain the log output. The code simply runs some operations on an input dataset, and I'm planning to run it with spark-submit.
This would allow me to save a separate log for each simulation; the idea is to match the log filename with the input dataset name.
Is that possible without changing the conf files or other Spark files?
Upvotes: 2
Views: 3895
Reputation: 4375
If you are running in yarn-cluster mode, you can retrieve the logs with:
yarn logs -applicationId <application ID>
If it is running in local or client mode, you can redirect the output of spark-submit:
spark-submit myapp.py 2> mylogfile
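Spark writes its log4j output (the INFO/WARN lines) to stderr, which is why redirecting file descriptor 2 captures it; you can simply name the redirected file after the dataset, e.g. spark-submit myapp.py dataset1.csv 2> dataset1.log.

If you specifically want to choose the log file from inside the code, here is a minimal sketch of one possible approach, not something Spark documents directly: it attaches a log4j FileAppender to the root logger through the Py4J gateway. It assumes the bundled log4j 1.x API, and the dataset name and paths are placeholders; it also only captures driver-side messages.

from pyspark import SparkContext

sc = SparkContext(appName="my_simulation")

# Reach log4j on the driver JVM through the Py4J gateway
log4j = sc._jvm.org.apache.log4j

# Build a file appender pointing at a log file named after the dataset
# ("dataset1" is a placeholder for whatever input you are processing)
appender = log4j.FileAppender()
appender.setFile("logs/dataset1.log")
appender.setLayout(log4j.PatternLayout("%d{yy/MM/dd HH:mm:ss} %p %c: %m%n"))
appender.setAppend(False)
appender.activateOptions()

# Attach it to the root logger so driver INFO/WARN messages go to the file
log4j.Logger.getRootLogger().addAppender(appender)

Note that this only affects logging from the driver JVM; executor logs still end up in the usual per-executor locations (or in yarn logs as above).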
Upvotes: 1