Reputation: 21
For Spark jobs, we are trying to add a logging framework that writes a custom log file to the local FS. In client mode everything is fine: the files are created on the local FS, owned by the user who launched the spark-submit. In cluster mode, however, the local files are created by the yarn user, which does not have permission to write to the local directory...
Is there any solution to write a local file in cluster mode as the user who submitted the job, without changing permissions to 777 everywhere? And is cluster mode even the better choice here (we are in a PROD environment), knowing that the job is launched from a node of the cluster (so there is no network issue)?
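To illustrate, this is roughly how we launch in each mode (the jar and main class below are placeholders):

    # Client mode: the driver runs inside the spark-submit process, as the
    # submitting user, so file appenders write with that user's permissions.
    spark-submit --master yarn --deploy-mode client --class com.example.Main my-job.jar

    # Cluster mode: the driver runs inside a YARN container, typically as the
    # 'yarn' user on an unsecured cluster, so local files it creates belong to 'yarn'.
    spark-submit --master yarn --deploy-mode cluster --class com.example.Main my-job.jar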
Thank you.
Upvotes: 0
Views: 623
Reputation: 1023
Yes, here is a way: use a shell script to submit the Spark job and collect its logs afterwards.
We use a logger to print all our logs, and we always include a unique tag in the log message, e.g. log.info("INFO_CUSTOM: Info message"). Once the application has completed, we run the yarn logs command and grep for that unique tag.
e.g. yarn application -list -appStates FINISHED,FAILED,KILLED | grep <application name>
e.g. yarn logs -applicationId <application id you got from step 1> | grep -w "INFO_CUSTOM" >> joblog.log
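For completeness, a minimal wrapper script along those lines (the application name, jar, and main class are placeholders, and it assumes YARN log aggregation is enabled). Because the script runs as the submitting user, joblog.log is created on the local FS with that user's permissions:

    #!/bin/bash
    # Sketch: submit in cluster mode, then pull the aggregated YARN logs
    # and keep only the lines carrying our unique tag.
    APP_NAME="my-spark-job"

    # spark-submit in YARN cluster mode waits for the application to finish
    # by default, so the log lookup below runs after completion.
    spark-submit \
      --master yarn \
      --deploy-mode cluster \
      --name "$APP_NAME" \
      --class com.example.Main \
      my-job.jar

    # Look up the application id by name among completed runs; if the name
    # matches several runs, keep the last one listed.
    APP_ID=$(yarn application -list -appStates FINISHED,FAILED,KILLED \
      | grep "$APP_NAME" | tail -n 1 | awk '{print $1}')

    # Grep the aggregated logs for the tag; the output file belongs to the
    # submitting user, so there is no local permission problem.
    yarn logs -applicationId "$APP_ID" | grep -w "INFO_CUSTOM" >> joblog.log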
Upvotes: 0