Reputation: 131
I am trying to write a SparkDataFrame using SparkR.
write.df(spark_df, "/mypartition/enablers/Prod Data/data2/tempdata/tempdata_l2/", "csv")
But I am getting the following error:
InsertIntoHadoopFsRelationCommand: Aborting job.
java.io.IOException: Failed to rename DeprecatedRawLocalFileStatus{path=file:/mypartition/enablers/Prod Data/data2/tempdata/tempdata_l2/_temporary/0/task_201610040736_0200_m_000112/part-r-00112-c4c5f30e-343d-4b02-a0f2-e9e5582047e5.snappy.parquet; isDirectory=false; length=331279; replication=1; blocksize=33554432; modification_time=1475566611000; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false} to file:/mypartition/enablers/Prod Data/data2/tempdata/tempdata_l2/part-r-00112-c4c5f30e-343d-4b02-a0f2-e9e5582047e5.snappy.parquet
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.mergePaths(FileOutputCommitter.java:371)
In addition, I am also getting the following warning:
WARN FileUtil: Failed to delete file or dir [/mypartition/enablers/Prod Data/data2/tempdata/tempdata_l2/_temporary/0/task_201610040736_0200_m_000110/.part-r-00110-c4c5f30e-343d-4b02-a0f2-e9e5582047e5.snappy.parquet.crc]: it still exists.
Thanks in advance for your valuable insight.
Upvotes: 0
Views: 417
Reputation: 131
Got it solved by using the root user. Initially Spark was writing as root, but while deleting the temp files it was using the logged-in user; switching the logged-in user to root solved it.
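For anyone hitting the same mismatch, a minimal sketch of how to check and align ownership of the output directory before re-running the job (the path is taken from the question; using root and sudo here is an assumption about the environment, not part of the original answer):
ls -ld "/mypartition/enablers/Prod Data/data2/tempdata/tempdata_l2"            # check who owns the target directory
sudo chown -R root:root "/mypartition/enablers/Prod Data/data2/tempdata/tempdata_l2"   # align ownership with the user actually running the Spark job
The general point is that the user writing the task output and the user cleaning up _temporary must be the same (or at least share write permission on the directory).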
Upvotes: 1
Reputation: 118
The checksum file was not deleted properly. Can you try renaming the checksum (.crc) file and re-executing?
cd "/mypartition/enablers/Prod Data/data2/tempdata/tempdata_l2/_temporary/0/task_201610040736_0200_m_000110/"
mv .part-r-00110-c4c5f30e-343d-4b02-a0f2-e9e5582047e5.snappy.parquet.crc .part-r-00110-c4c5f30e-343d-4b02-a0f2-e9e5582047e5.snappy.parquet.crc_backup
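If the rename goes through, re-executing the original write.df call from the question should no longer trip over the stale .crc file; passing mode = "overwrite" to write.df (an optional suggestion, not part of the original answer) avoids the job aborting because the target path already contains partial output.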
Upvotes: 0