Reputation: 3111
I have a base DataFrame containing all the data, and several derived DataFrames that I create from it one after another by grouping, joining, etc.
Every time I want to append a column to the last DataFrame, which contains the most relevant data, I have to do something like this:
val theMostRelevantFinalDf = olderDF
  .withColumn("new_date_", to_utc_timestamp(unix_timestamp(col("new_date")).cast(TimestampType), "UTC").cast(StringType))
  .drop($"new_date")
As you can see, I have to change the original column name to new_date_.
But I want the column name to remain the same. However, if I don't change the name, the column gets dropped, so renaming is just a not-too-pretty workaround.
How can I preserve the original column name when appending the column?
Upvotes: 0
Views: 75
Reputation: 4133
As far as I know, you cannot create two columns with the same name in a DataFrame transformation. I rename the new column to the old column's name, like this:
val theMostRelevantFinalDf = olderDF
  .withColumn("new_date_", to_utc_timestamp(unix_timestamp(col("new_date")).cast(TimestampType), "UTC").cast(StringType))
  .drop($"new_date")
  .withColumnRenamed("new_date_", "new_date")
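A possibly simpler variant, offered as a sketch rather than a definitive fix: Spark documents withColumn as adding a column or replacing an existing column that has the same name. Under that behaviour, passing "new_date" itself should overwrite the column in place, and it is the trailing drop($"new_date") that removes it in the original attempt. The object name, the helper normalizeNewDate, the local SparkSession, and the sample row below are all made up for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, to_utc_timestamp, unix_timestamp}
import org.apache.spark.sql.types.{StringType, TimestampType}

object ReplaceColumnInPlace {
  // Hypothetical helper: applies the same expression as in the question,
  // but writes the result back under the existing column name.
  def normalizeNewDate(df: DataFrame): DataFrame =
    df.withColumn(
      "new_date", // same name as the existing column, so it is replaced, not duplicated
      to_utc_timestamp(unix_timestamp(col("new_date")).cast(TimestampType), "UTC").cast(StringType)
    ) // no drop and no rename afterwards

  def main(args: Array[String]): Unit = {
    // Local session purely for demonstration (assumption, not from the question)
    val spark = SparkSession.builder().master("local[*]").appName("replace-column").getOrCreate()
    import spark.implicits._

    // Made-up sample data standing in for olderDF
    val olderDF = Seq((1, "2020-01-01 12:00:00")).toDF("id", "new_date")

    val theMostRelevantFinalDf = normalizeNewDate(olderDF)
    theMostRelevantFinalDf.printSchema() // the column is still called new_date

    spark.stop()
  }
}
```

If the original values are still needed next to the converted ones, the rename approach above remains the way to go, since a DataFrame built with withColumn keeps only one column per name.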
Upvotes: 1