Qing

Reputation: 143

Spark job keeps having output folder already exists exception

I am running a Spark job and it keeps failing with an "output folder already exists" exception. I did remove the output folder before starting the job. It looks like the folder is created during the job, which then confuses other nodes/threads. The failure happens intermittently, not on every run.

Upvotes: 0

Views: 1515

Answers (2)

morfious902002

Reputation: 918

df.write().format("parquet").mode(SaveMode.Overwrite).save("location");

Note that write() is a method on a DataFrame/Dataset, not on an RDD. With SaveMode.Overwrite, Spark replaces any existing output at the path instead of failing, which should solve the "output directory already exists" issue.

Upvotes: 2

Lezzar Walid

Reputation: 148

If you are writing to a local filesystem path (rather than HDFS or another shared filesystem), be aware that the output folder is created on every worker. So you probably have to delete it on all of them, not just on the driver.
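A pre-run cleanup step along these lines (plain JDK, no Spark dependency; the helper name `deleteIfExists` is just for illustration) can be run on each machine to clear a stale local output folder before resubmitting the job:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class CleanOutputDir {

    // Recursively delete the output folder if it exists, so a re-run
    // does not hit "output directory already exists".
    static void deleteIfExists(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse lexicographic order deletes children before parents.
            walk.sorted(Comparator.reverseOrder())
                .forEach(p -> {
                    try {
                        Files.delete(p);
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                });
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulate a leftover output directory with one part file.
        Path out = Files.createTempDirectory("spark-out");
        Files.createFile(out.resolve("part-00000"));
        deleteIfExists(out);
        System.out.println(Files.exists(out)); // prints "false"
    }
}
```

For HDFS paths, the equivalent cleanup would go through Hadoop's FileSystem API instead, and only needs to run once since the filesystem is shared.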

Upvotes: 0
