Lisa

Reputation: 49

Scala Spark - overwrite parquet file failed to delete file or dir

I'm trying to create Parquet files for several days locally. The first time I run the code, everything works fine. The second time, it fails to delete a file. The third time, it fails to delete a different file. Which file cannot be deleted seems to be random.

I need this to work because I want to create Parquet files every day for the last seven days, so the Parquet files that already exist should be overwritten with the updated data.

I use Project SDK 1.8, Scala version 2.11.8 and Spark version 2.0.2.

After running this line of code the second time:

newDF.repartition(1).write.mode(SaveMode.Overwrite).parquet(
    OutputFilePath + "/day=" + DateOfData)

this error occurs:

WARN FileUtil: Failed to delete file or dir [C:\Users\...\day=2018-07-15\._SUCCESS.crc]: it still exists.
Exception in thread "main" java.io.IOException: Unable to clear output directory file:/C:/Users/.../day=2018-07-15 prior to writing to it
    at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:91)

After the third time:

WARN FileUtil: Failed to delete file or dir [C:\Users\day=2018-07-20\part-r-00000-8d1a2bde-c39a-47b2-81bb-decdef8ea2f9.snappy.parquet]: it still exists.
Exception in thread "main" java.io.IOException: Unable to clear output directory file:/C:/Users/day=2018-07-20 prior to writing to it
    at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:91)

As you can see, it is a different file than in the second run, and so on. After deleting the files manually, all Parquet files can be created.

Does anybody know this issue and how to fix it?

Edit: It's always a crc-file that can't be deleted.

Upvotes: 3

Views: 9212

Answers (3)

a.moussa

Reputation: 3277

This problem occurs when you have the destination directory open in Windows. You just need to close it.

Upvotes: 1

Lisa

Reputation: 49

Thanks for your answers. :) The solution is not to write to the Users directory; there seems to be a permission problem there. I created a new folder directly under C: and it works perfectly.

Upvotes: 1

DW.

Reputation: 464

Perhaps another Windows process has a lock on the file so it can't be deleted.
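
One way to check this is to try clearing the output directory yourself before the write and see which paths refuse to go away. The following is only a rough sketch under my own assumptions: `ClearOutputDir`, `deleteRecursively` and `clearWithRetry` are hypothetical names, not part of Spark, and the code uses nothing but `java.nio.file`. It does not release a lock held by another process; it only retries a few times and reports what is still stuck.

```scala
import java.io.IOException
import java.nio.file.{Files, Path}
import scala.collection.JavaConverters._

// Hypothetical helper, not part of Spark.
object ClearOutputDir {

  /** Tries to delete `dir` and everything under it.
    * Returns the paths that could not be deleted (empty if nothing is left). */
  def deleteRecursively(dir: Path): Seq[Path] =
    if (!Files.exists(dir)) Seq.empty
    else {
      val stream = Files.walk(dir)
      // Files.walk lists a directory before its entries, so the reversed
      // list yields files before the directories that contain them.
      val paths =
        try stream.iterator().asScala.toList.reverse
        finally stream.close()
      paths.filter { p =>
        try { Files.deleteIfExists(p); false }
        catch { case _: IOException => true } // locked, no permission, or non-empty parent
      }
    }

  /** Retries the deletion, in case a lock is only held for a short time. */
  def clearWithRetry(dir: Path, attempts: Int = 3, waitMillis: Long = 500L): Seq[Path] = {
    var remaining = deleteRecursively(dir)
    var tries = 1
    while (remaining.nonEmpty && tries < attempts) {
      Thread.sleep(waitMillis)
      remaining = deleteRecursively(dir)
      tries += 1
    }
    remaining
  }
}
```

You could call it right before the write from the question, e.g. `ClearOutputDir.clearWithRetry(java.nio.file.Paths.get(OutputFilePath, "day=" + DateOfData))`, and print the returned paths if the result is not empty. If the same files keep showing up, something else is probably holding them open, or the process lacks permission on that folder.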

Upvotes: 0
