samba

Reputation: 3101

Databricks - How to change a partition of an existing Delta table?

I have a table in Databricks Delta which is partitioned by transaction_date. I want to change the partition column to view_date. I tried dropping the table and then re-creating it with the new partition column using PARTITIONED BY (view_date).

However, my attempt failed: the actual files reside in S3, and even after I drop the Hive table the partitions remain the same. Is there any way to change the partition column of an existing Delta table, or is the only solution to drop the actual data and reload it with the new partition column?

Upvotes: 21

Views: 42334

Answers (3)

Jorge Tovar

Reputation: 1869

Python solution:

def change_partition_of(table_name, column):
    df = spark.read.table(table_name)
    (df.write.format("delta")
       .mode("overwrite")
       .option("overwriteSchema", "true")
       .partitionBy(column)
       .saveAsTable(table_name))

change_partition_of("i.love_python", "column_a")

If you need more than one column in the partition, pass them all to partitionBy:

partitionBy(column, column_2, ...)
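The multi-column case mentioned above can be sketched as follows. Note that change_partitions_of is a hypothetical variant, not from the original answer: it takes the SparkSession explicitly instead of relying on the notebook-global spark, and the table/column names are placeholders.

```python
# Hypothetical multi-column variant of the helper above; `spark` is passed in
# explicitly, and the table/column names used below are placeholders.
def change_partitions_of(spark, table_name, *columns):
    if not columns:
        raise ValueError("at least one partition column is required")
    df = spark.read.table(table_name)
    (df.write.format("delta")
       .mode("overwrite")                  # rewrite the table contents
       .option("overwriteSchema", "true")  # allow the partition layout to change
       .partitionBy(*columns)              # one or more partition columns
       .saveAsTable(table_name))

# On a cluster with Delta Lake configured:
# change_partitions_of(spark, "i.love_python", "column_a", "column_b")
```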

Upvotes: 2

Sebastian

Reputation: 388

As Silvio pointed out, there is no need to drop the table. In fact, the approach strongly recommended by Databricks is to replace the table.

https://docs.databricks.com/sql/language-manual/sql-ref-syntax-ddl-create-table-using.html#parameters

In Spark SQL, this can be done easily with:

REPLACE TABLE <tablename>
  USING DELTA
  PARTITIONED BY (view_date)
AS
  SELECT * FROM <tablename>

Adapted example from:
https://docs.databricks.com/delta/best-practices.html#replace-the-content-or-schema-of-a-table
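If you prefer to drive this from Python, the same statement can be built and submitted through spark.sql. A sketch under the assumption of a live SparkSession with Delta Lake configured; build_replace_table_sql and repartition_table are hypothetical helpers, not a Databricks API:

```python
# Hypothetical helper that builds the REPLACE TABLE statement shown above
# for a given table name and list of partition columns.
def build_replace_table_sql(table_name, partition_cols):
    cols = ", ".join(partition_cols)
    return (
        f"REPLACE TABLE {table_name} "
        f"USING DELTA "
        f"PARTITIONED BY ({cols}) "
        f"AS SELECT * FROM {table_name}"
    )

def repartition_table(spark, table_name, partition_cols):
    # Requires a SparkSession with Delta Lake configured (e.g. on Databricks).
    spark.sql(build_replace_table_sql(table_name, partition_cols))
```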

Upvotes: 12

Silvio

Reputation: 4207

There's actually no need to drop tables or remove files. All you need to do is read the current table, overwrite the contents AND the schema, and change the partition column:

val input = spark.read.table("mytable")

input.write.format("delta")
  .mode("overwrite")
  .option("overwriteSchema", "true")
  .partitionBy("colB") // different column
  .saveAsTable("mytable")

UPDATE: There previously was a bug with time travel and changes in partitioning that has now been fixed.
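One way to confirm the rewrite took effect is to inspect the table's partitionColumns field via DESCRIBE DETAIL. A Python sketch, assuming a live SparkSession; partition_columns_of is a hypothetical helper:

```python
# Hypothetical helper: returns the current partition columns of a Delta table
# by reading the partitionColumns field of DESCRIBE DETAIL.
def partition_columns_of(spark, table_name):
    detail = spark.sql(f"DESCRIBE DETAIL {table_name}").collect()[0]
    return detail["partitionColumns"]

# After the overwrite above, this should report the new partition column:
# partition_columns_of(spark, "mytable")
```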

Upvotes: 40
