Mohit Bansal

Reputation: 131

write.df failing in SparkR

I am trying to write a SparkDataFrame using SparkR.

write.df(spark_df,"/mypartition/enablers/Prod Data/data2/tempdata/tempdata_l2/","csv")
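For reference, the same call with named arguments looks like this (assuming the Spark 2.x write.df signature, where csv is a built-in source; on 1.x the com.databricks.spark.csv package would be needed instead). Note that the part files in the error below are .snappy.parquet, so the write appears to be using the default parquet source:

# Same call with named arguments and an explicit write mode
write.df(spark_df,
         path = "/mypartition/enablers/Prod Data/data2/tempdata/tempdata_l2/",
         source = "csv",
         mode = "overwrite")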

But I'm getting the following error:

InsertIntoHadoopFsRelationCommand: Aborting job.
java.io.IOException: Failed to rename DeprecatedRawLocalFileStatus{path=file:/mypartition/enablers/Prod Data/data2/tempdata/tempdata_l2/_temporary/0/task_201610040736_0200_m_000112/part-r-00112-c4c5f30e-343d-4b02-a0f2-e9e5582047e5.snappy.parquet; isDirectory=false; length=331279; replication=1; blocksize=33554432; modification_time=1475566611000; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false} to file:/mypartition/enablers/Prod Data/data2/tempdata/tempdata_l2/part-r-00112-c4c5f30e-343d-4b02-a0f2-e9e5582047e5.snappy.parquet
    at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.mergePaths(FileOutputCommitter.java:371)

In addition, I'm also getting the following warning:

WARN FileUtil: Failed to delete file or dir [/mypartition/enablers/Prod Data/data2/tempdata/tempdata_l2/_temporary/0/task_201610040736_0200_m_000110/.part-r-00110-c4c5f30e-343d-4b02-a0f2-e9e5582047e5.snappy.parquet.crc]: it still exists.

Thanks in advance for your valuable insight.

Upvotes: 0

Views: 417

Answers (2)

Mohit Bansal

Reputation: 131

Got it solved by using the root user. Spark was writing the files as root, but when deleting the temp files it was running as the logged-in user; changing the logged-in user to root solved it.
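For anyone hitting the same mismatch, a minimal sketch of how to check and align ownership of the output tree (this assumes Spark runs as root here, per the above; substitute whichever user actually runs your Spark process):

# See who owns the output directory and the _temporary subtree
ls -la "/mypartition/enablers/Prod Data/data2/tempdata/tempdata_l2/"

# Hypothetical fix: make one user own the whole tree, so the same user
# both writes the part files and cleans up the _temporary files
sudo chown -R root:root "/mypartition/enablers/Prod Data/data2/tempdata/tempdata_l2/"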

Upvotes: 1

Sagar Shah

Reputation: 118

The checksum (.crc) file was not deleted properly, and a stale .crc left behind can make the next attempt fail. Can you try renaming the checksum file and re-executing?

cd "/mypartition/enablers/Prod Data/data2/tempdata/tempdata_l2/_temporary/0/task_201610040736_0200_m_000110/"

mv .part-r-00110-c4c5f30e-343d-4b02-a0f2-e9e5582047e5.snappy.parquet.crc .part-r-00110-c4c5f30e-343d-4b02-a0f2-e9e5582047e5.snappy.parquet.crc_backup

Upvotes: 0
