Reputation: 303
I first set up a Delta Live Tables pipeline using Python as follows:
import dlt

@dlt.table
def transaction():
    # Incrementally ingest parquet files with Auto Loader
    return (
        spark
        .readStream
        .format("cloudFiles")
        .schema(transaction_schema)
        .option("cloudFiles.format", "parquet")
        .load(path)
    )
I wrote the Delta Live Table to the target database test using this pipeline configuration:
{
  "id": <id>,
  "clusters": [
    {
      "label": "default",
      "autoscale": {
        "min_workers": 1,
        "max_workers": 5
      }
    }
  ],
  "development": true,
  "continuous": false,
  "edition": "core",
  "photon": false,
  "libraries": [
    {
      "notebook": {
        "path": <path>
      }
    }
  ],
  "name": "dev pipeline",
  "storage": <storage>,
  "target": "test"
}
Everything worked as expected in the first trial.
After a while, I noticed that I had forgotten to add a partition column to the table, so I dropped the table in test with DROP TABLE test.transaction and updated the notebook to:
import dlt
from pyspark.sql import functions as F

@dlt.table(
    partition_cols=["partition"],
)
def transaction():
    return (
        spark
        .readStream
        .format("cloudFiles")
        .schema(transaction_schema)
        .option("cloudFiles.format", "parquet")
        .load(path)
        # Derive a date partition column from the event timestamp
        .withColumn("partition", F.to_date("timestamp"))
    )
However, when I ran the pipeline again, I got an error:
org.apache.spark.sql.AnalysisException: Cannot change partition columns for table transaction.
Current:
Requested: partition
It looks like I can't change the partition columns just by dropping the target table.
What is the proper way to change partition columns in delta live tables?
Upvotes: 3
Views: 3017
Reputation: 87164
If you have changed the partitioning schema, then instead of starting the pipeline with the Start button, you need to select the "Full refresh" option from the dropdown next to the Start button. A full refresh reprocesses all of the source data and recreates the target table, so the new partition columns are applied.
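If you trigger pipeline updates through the Databricks REST API rather than the UI, you can request the same behaviour by setting full_refresh on the update call. This is a minimal sketch, assuming a personal access token; the workspace URL, pipeline ID, and token are placeholders for your own values.

import requests

# Placeholders: replace with your workspace URL, pipeline ID, and token
workspace_url = "https://<workspace-url>"
pipeline_id = "<pipeline-id>"
token = "<personal-access-token>"

# Start a pipeline update with full_refresh=True, which reprocesses all data
# and lets DLT recreate the table with the new partition columns
resp = requests.post(
    f"{workspace_url}/api/2.0/pipelines/{pipeline_id}/updates",
    headers={"Authorization": f"Bearer {token}"},
    json={"full_refresh": True},
)
resp.raise_for_status()
print(resp.json())  # contains the update_id of the triggered run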
Upvotes: 4