Reputation: 75
I have a Delta table with liquid clustering enabled. I wonder whether there is any way to modify a column of the Delta table with PySpark. I tried the following approach, but it didn't work.
from delta.tables import DeltaTable
from pyspark.sql.types import StringType, TimestampType, IntegerType

# Create the table with liquid clustering on TIMESTAMP
DeltaTable.createOrReplace() \
    .tableName("SAMPLE") \
    .addColumn("NAME", dataType=StringType()) \
    .addColumn("TIMESTAMP", dataType=TimestampType()) \
    .addColumn("ZONE_ID", dataType=StringType()) \
    .addColumn("VALUE", dataType=IntegerType()) \
    .clusterBy("TIMESTAMP") \
    .execute()
from pyspark.sql.functions import lit

# Add NEW_COL by overwriting the table and its schema
spark.table("default.SAMPLE") \
    .withColumn("NEW_COL", lit("new metric")) \
    .write \
    .format("delta") \
    .mode("overwrite") \
    .option("overwriteSchema", "true") \
    .saveAsTable("SAMPLE")
spark.sql("DESCRIBE DETAIL SAMPLE;").show()
The clusteringColumns field in the output became empty:

...|clusteringColumns|...
...|               []|...
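For completeness, here is a minimal sketch of the alternative I have been considering but have not verified, assuming ALTER TABLE ... ADD COLUMNS is a metadata-only change that leaves the clustering spec untouched:

# Untested sketch: add the column as a metadata-only change and backfill it,
# instead of overwriting the whole table definition.
# Assumption: ALTER TABLE ... ADD COLUMNS does not reset clusteringColumns.
spark.sql("ALTER TABLE SAMPLE ADD COLUMNS (NEW_COL STRING)")
spark.sql("UPDATE SAMPLE SET NEW_COL = 'new metric'")

# Check whether the clustering spec survived
spark.sql("DESCRIBE DETAIL SAMPLE").select("clusteringColumns").show()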
Upvotes: 0
Views: 28