Reputation: 3177
I am using PySpark to run some simulations with different datasets, and I'd like to save all the console output (INFOs, WARNs, etc.) to a text file on the fly, that is, by declaring inside the code the text file that will contain the log output. The code simply runs some operations on an input dataset, and I'm planning to run it with spark-submit.
This would allow me to save a separate log for each simulation; the idea is to match the log filename with the input dataset name.
Is that possible without changing the conf files or other Spark files?
Upvotes: 2
Views: 3895
Reputation: 4375
If you are running in yarn-cluster mode, you can retrieve the logs with:
yarn logs -applicationId <application ID>
If it is running in local or client mode, you can redirect the output of spark-submit:
spark-submit myapp.py 2> mylogfile
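Spark writes its log4j output (the INFO/WARN lines) to stderr, which is why redirecting file descriptor 2 captures it; you can simply name the redirected file after the dataset, e.g. spark-submit myapp.py dataset1.csv 2> dataset1.log.

If you specifically want to choose the log file from inside the code, here is a minimal sketch of one possible approach, not something Spark documents directly: it attaches a log4j FileAppender to the root logger through the Py4J gateway. It assumes the bundled log4j 1.x API, and the dataset name and paths are placeholders; it also only captures driver-side messages.

from pyspark import SparkContext

sc = SparkContext(appName="my_simulation")

# Reach log4j on the driver JVM through the Py4J gateway
log4j = sc._jvm.org.apache.log4j

# Build a file appender pointing at a log file named after the dataset
# ("dataset1" is a placeholder for whatever input you are processing)
appender = log4j.FileAppender()
appender.setFile("logs/dataset1.log")
appender.setLayout(log4j.PatternLayout("%d{yy/MM/dd HH:mm:ss} %p %c: %m%n"))
appender.setAppend(False)
appender.activateOptions()

# Attach it to the root logger so driver INFO/WARN messages go to the file
log4j.Logger.getRootLogger().addAppender(appender)

Note that this only affects logging from the driver JVM; executor logs still end up in the usual per-executor locations (or in yarn logs as above).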
Upvotes: 1