Babu

Reputation: 891

Writing a DataFrame into an existing directory using partitionBy

With the code below, I am not able to write the DataFrame into an existing directory; the spark-submit job just exits. Is there a way to write to the existing directory instead of creating a new one?

Here, test is a DataFrame:

test.repartition(100).write.partitionBy("date").parquet(hdfslocation)

Upvotes: 2

Views: 1434

Answers (1)

Ramesh Maharjan

Reputation: 41987

You can write to an existing directory as long as the file names differ between writes. However, the default save mode is ErrorIfExists, which makes the write fail when the target path already exists, so you need to set a save mode explicitly.

If you want to overwrite the existing files in the existing directory, you don't need to change the file names; simply set the mode option:

import org.apache.spark.sql.SaveMode
test.repartition(100).write.mode(SaveMode.Overwrite).partitionBy("date").parquet(hdfslocation)

The other save modes you can use are Append, ErrorIfExists (the default), and Ignore.
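To add data to the directory rather than replace it, Append is probably the mode you want. Below is a minimal sketch of that case. The local session, the sample rows, and the /tmp/partitioned_output path are assumptions made for illustration and do not come from the question.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

object AppendToExistingDir {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only; a spark-submit job would
    // normally get its master from the launcher.
    val spark = SparkSession.builder()
      .appName("append-to-existing-dir")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical stand-in for the `test` DataFrame from the question.
    val test = Seq(("2018-01-01", 1), ("2018-01-02", 2)).toDF("date", "value")

    // Hypothetical output path; replace it with the real hdfslocation.
    val hdfslocation = "/tmp/partitioned_output"

    // Append adds new part files under each date=... directory and
    // leaves the files that are already there untouched.
    test.write.mode(SaveMode.Append).partitionBy("date").parquet(hdfslocation)

    // The directory now exists, but a second write still succeeds
    // because the mode is Append rather than the default ErrorIfExists.
    test.write.mode(SaveMode.Append).partitionBy("date").parquet(hdfslocation)

    // Reading back shows the rows from both writes.
    spark.read.parquet(hdfslocation).show()

    spark.stop()
  }
}
```

Note that Append does not deduplicate: re-running the same write adds the same rows again, so use Overwrite if the job is meant to replace earlier output.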

Upvotes: 2
