aName

Reputation: 3043

How to read and write (update) the same file using Spark (Scala)

I want to update a CSV file depending on some condition. To do that, I read the file and made all the needed updates, but when I tried to write the result back, I got a FileNotFoundException.

I think it is caused by the writing process, because when I check the path (where the input/output files were located) I find it empty.

Is there a better way to update a file? And if not, how can I resolve the FileNotFoundException error?

Upvotes: 1

Views: 2036

Answers (1)

Raphael Roth

Reputation: 27373

You can do it either by writing to a temporary table/CSV first or by using checkpointing. The FileNotFoundException occurs because Spark reads lazily: mode("overwrite") deletes the files at the target path before the read is actually executed, so the source data is gone by the time Spark tries to read it. Both approaches work around this by materializing the data somewhere else before the original location is overwritten.

The checkpointing variant works like this:

sparkSession.sparkContext.setCheckpointDir("tmp")

sparkSession.read.csv("test.csv")  // read the existing CSV
  .withColumn("test", lit(1))      // modify it
  .checkpoint(eager = true)        // checkpoint: materialize the result to disk
  .write.mode("overwrite")
  .csv("test.csv")                 // write back to the same location
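The temporary-location variant mentioned above can be sketched as follows. This is an illustrative example, not code from the question: the paths ("test.csv", "test_tmp.csv") are made up, and a small input file is created first so the snippet is self-contained.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.lit

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("update-csv")
  .getOrCreate()
import spark.implicits._

// Create a small input file so the example is self-contained.
Seq("a", "b").toDF("value").write.mode("overwrite").csv("test.csv")

// Write the modified data to a temporary location first, forcing it
// onto disk before the original files are touched.
val tmpPath = "test_tmp.csv" // illustrative name
spark.read.csv("test.csv")
  .withColumn("test", lit(1))
  .write.mode("overwrite")
  .csv(tmpPath)

// Then read the materialized copy back and overwrite the original.
spark.read.csv(tmpPath)
  .write.mode("overwrite")
  .csv("test.csv")
```

The trade-off versus checkpointing is that you manage (and should eventually clean up) the temporary path yourself, but you avoid configuring a checkpoint directory.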

Upvotes: 4
