Reputation: 494
What is the most effective way to swap the names of two columns in Delta Lake? Let's assume that I have the following columns:
Address | Name
And I'd like to swap the names so that I have:
Name | Address
First I rename both columns to temporary names (to avoid a name clash):
spark.read.table("table") \
.withColumnRenamed("address", "name1") \
.withColumnRenamed("name", "address1") \
.write \
.format("delta") \
.mode("overwrite") \
.option("overwriteSchema", "true") \
.saveAsTable("table")
Then I rename the temporary columns to their final names:
spark.read.table("table") \
.withColumnRenamed("name1", "name") \
.withColumnRenamed("address1", "address") \
.write \
.format("delta") \
.mode("overwrite") \
.option("overwriteSchema", "true") \
.saveAsTable("table")
Upvotes: 1
Views: 1605
Reputation: 87119
What about just using the toDF function on the DataFrame, which sets the new names in place of the existing ones:
spark.read.table("table") \
.toDF("name", "address") \
.write....
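For completeness, here is a minimal sketch of that idea written back to Delta, reusing the table name and overwrite options from the question (just one way to do the write-back, not the only one). Note that toDF assigns the new names positionally, so the first name goes to the first column and so on:
spark.read.table("table") \
.toDF("name", "address") \
.write \
.format("delta") \
.mode("overwrite") \
.option("overwriteSchema", "true") \
.saveAsTable("table")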
If you have more columns, then you can adapt it a bit by using a mapping between the existing & new names, and generating the correct list of column names:
mapping = {"address":"name", "name":"address"}
df = spark.read.table("table")
new_cols = [mapping.get(cl, cl) for cl in df.columns]
df.toDF(*new_cols).write....
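To illustrate how the list comprehension resolves, here is a small sketch with a hypothetical extra column "city" that is not in the mapping and therefore keeps its name:
# hypothetical column list; in practice it comes from df.columns
cols = ["address", "name", "city"]
mapping = {"address": "name", "name": "address"}
new_cols = [mapping.get(cl, cl) for cl in cols]
# new_cols == ["name", "address", "city"]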
Upvotes: 1