Reputation: 75
I have a Delta table with liquid clustering enabled. I wonder whether there is any way to modify a column of the Delta table with PySpark. I tried the following approach, but it didn't work.
from delta.tables import DeltaTable
from pyspark.sql.types import StringType, TimestampType, IntegerType

# Create the table with liquid clustering on TIMESTAMP
DeltaTable.createOrReplace() \
    .tableName("SAMPLE") \
    .addColumn("NAME", dataType=StringType()) \
    .addColumn("TIMESTAMP", dataType=TimestampType()) \
    .addColumn("ZONE_ID", dataType=StringType()) \
    .addColumn("VALUE", dataType=IntegerType()) \
    .clusterBy("TIMESTAMP") \
    .execute()
from pyspark.sql.functions import lit

# Add NEW_COL by overwriting the table and its schema
spark.table("default.SAMPLE") \
    .withColumn("NEW_COL", lit("new metric")) \
    .write \
    .format("delta") \
    .mode("overwrite") \
    .option("overwriteSchema", "true") \
    .saveAsTable("SAMPLE")
spark.sql("DESCRIBE DETAIL SAMPLE;").show()
The clusteringColumns field in the output became empty:

...|clusteringColumns|...
...|               []|...
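For completeness, here is a minimal sketch of the alternative I have been considering but have not verified, assuming ALTER TABLE ... ADD COLUMNS is a metadata-only change that leaves the clustering spec untouched:

# Untested sketch: add the column as a metadata-only change and backfill it,
# instead of overwriting the whole table definition.
# Assumption: ALTER TABLE ... ADD COLUMNS does not reset clusteringColumns.
spark.sql("ALTER TABLE SAMPLE ADD COLUMNS (NEW_COL STRING)")
spark.sql("UPDATE SAMPLE SET NEW_COL = 'new metric'")

# Check whether the clustering spec survived
spark.sql("DESCRIBE DETAIL SAMPLE").select("clusteringColumns").show()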
Upvotes: 0
Views: 28