Nat

Reputation: 67

How to convert a Delta table back to plain Parquet files

Delta Lake is the default storage format. I understand how to convert a Parquet table to Delta.

My question is: is there any way to revert it back to Parquet? Are there any options?

What I need is a single Parquet file when writing. I do not need the extra log files.

Upvotes: 3

Views: 4081

Answers (2)

adrianabreudev

Reputation: 11

If you want to go back from Delta to Parquet you need to:

  1. Get rid of old versions of the data.
  2. Delete the _delta_log folder.

For 1, you can run the command VACUUM <table> RETAIN 0 HOURS. However, a safety check prevents you from running VACUUM with a retention period shorter than 168 hours (7 days). To bypass it, set the following Spark property first: spark.conf.set("spark.databricks.delta.retentionDurationCheck.enabled", "false")

For 2, just delete the _delta_log folder.
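
The two steps could be sketched in Python roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes `spark` is an existing SparkSession with Delta Lake enabled and that the table lives on a local filesystem path. The function names `vacuum_all_history` and `remove_delta_log` are made up for illustration. On cloud or DBFS storage you would delete the folder with your storage tooling instead of `shutil`.

```python
import shutil
from pathlib import Path


def vacuum_all_history(spark, table_name):
    """Step 1: remove data files not referenced by the current table version.

    `spark` is assumed to be a SparkSession with Delta Lake configured.
    """
    # Turn off the safety check that rejects retention below 168 hours.
    spark.conf.set(
        "spark.databricks.delta.retentionDurationCheck.enabled", "false"
    )
    spark.sql(f"VACUUM {table_name} RETAIN 0 HOURS")


def remove_delta_log(table_path):
    """Step 2: delete the _delta_log folder (local filesystem only).

    Returns True if the folder existed and was removed, False otherwise.
    """
    log_dir = Path(table_path) / "_delta_log"
    if log_dir.is_dir():
        shutil.rmtree(log_dir)
        return True
    return False
```

Run the vacuum only when no other job is reading or writing the table, since a retention of 0 hours also removes files that concurrent operations may still need.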

More info: https://delta.io/blog/remove-files-delta-lake-vacuum-command/

Upvotes: 0

Robert Kossendey

Reputation: 6998

If you run VACUUM on the table and delete the _delta_log folder, you end up with regular Parquet files.
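
One way to check the result is to inspect the table directory afterwards. The helper below is a hypothetical sketch (the name `summarize_table_dir` is not from any library) and assumes the table is on a local filesystem. It reports whether a _delta_log folder is still present and lists the remaining Parquet data files, including those in partition subfolders.

```python
from pathlib import Path


def summarize_table_dir(table_path):
    """Report whether _delta_log exists and list the Parquet data files."""
    root = Path(table_path)
    parquet_files = []
    for path in root.rglob("*.parquet"):
        relative = path.relative_to(root)
        # Skip checkpoint files that live inside the log folder itself.
        if "_delta_log" in relative.parts:
            continue
        parquet_files.append(relative.as_posix())
    return {
        "has_delta_log": (root / "_delta_log").is_dir(),
        "parquet_files": sorted(parquet_files),
    }
```

Note that this leaves as many Parquet files as the table had in its current version; it does not merge them into a single file.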

Upvotes: 4
