Reputation: 891
In the code below, I am not able to write the DataFrame into an existing directory; the spark-submit job simply exits. Is there a way to write to an existing directory instead of creating a new one?
Here test is a DataFrame:
test.repartition(100).write.partitionBy("date").parquet(hdfslocation)
Upvotes: 2
Views: 1434
Reputation: 41987
You can always write to an existing directory as long as the file names differ on each write, so one option is to find a mechanism that changes the names of the output files.
If you want to overwrite the existing files in the existing directory, then you don't need to change the filenames at all; simply use the mode option:
test.repartition(100).write.mode(SaveMode.Overwrite).partitionBy("date").parquet(hdfslocation)
There are other mode options you can play with: Append, ErrorIfExists, and Ignore. (valueOf and values, which show up in the SaveMode enum, are just the standard Java enum methods, not save modes.)
Upvotes: 2