Amir haroun

Reputation: 21

Writing to a local FS in cluster mode SPARK

For our Spark jobs, we are trying to add a logging framework that creates a custom log file on a local FS. In client mode everything is fine: the files are created on the local FS with the user who launched the spark-submit. However, in cluster mode, the local files are created by the yarn user, which does not have permission to write to the local directory...

Is there any solution to write a local file in cluster mode as the user who submitted the job, without changing permissions to 777 everywhere? Is cluster mode even the better choice in this case (we are in a PROD environment), given that the job is launched from a node of the cluster (so there is no network issue)?

Thank you.

Upvotes: 0

Views: 623

Answers (1)

Sathiyan S

Reputation: 1023

Yes, here is a way: use a shell script around your spark-submit.

We use a logger to print all our logs, and we always include a unique tag in the log message, e.g. log.info("INFO_CUSTOM: Info message"). Once the application has completed, we run the yarn logs command and grep for the unique tag.

  1. Get the application id using the yarn command with the application name.

eg. yarn application -list -appStates FINISHED,FAILED,KILLED | grep <application name>

  2. Run the yarn logs command, grep for the tag, and redirect the output to the file you want.

eg. yarn logs -applicationId <application id from step 1> | grep -w "INFO_CUSTOM" >> joblog.log
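The two steps above can be sketched as a small script. The `INFO_CUSTOM` tag and `joblog.log` come from the answer; the function names, the `my-spark-job` application name, and the assumption that the application id is the first column of `yarn application -list` output are illustrative, not confirmed by the source.

```shell
#!/usr/bin/env bash
# Sketch: collect tagged Spark driver/executor log lines into a local file
# owned by whoever runs this script (i.e. the submitting user).

# Step 1 helper: read `yarn application -list` output on stdin and print
# the first application id whose line mentions the given app name.
# (Assumes the id is the first whitespace-separated column.)
extract_app_id() {
  local app_name="$1"
  grep "$app_name" | awk '{print $1}' | head -n 1
}

# Steps 1 + 2 combined: look up the finished application's id, then pull
# its aggregated YARN logs and keep only the uniquely tagged lines.
collect_job_log() {
  local app_name="$1" out_file="$2"
  local app_id
  app_id=$(yarn application -list -appStates FINISHED,FAILED,KILLED \
             | extract_app_id "$app_name")
  yarn logs -applicationId "$app_id" | grep -w "INFO_CUSTOM" >> "$out_file"
}

# Example usage (requires a YARN client on the node):
#   collect_job_log "my-spark-job" joblog.log
```

Because the script itself runs as the submitting user, the output file is created with that user's ownership, sidestepping the yarn-user permission problem.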

Upvotes: 0
