samba

Reputation: 3111

Scala - How to append a column to a DataFrame preserving the original column name?

I have a base DataFrame containing all the data, and several derived DataFrames that I create from it through grouping, joins, etc.

Every time I want to append a column to the last DataFrame, which contains the most relevant data, I have to do something like this:

val theMostRelevantFinalDf = olderDF.withColumn("new_date_", to_utc_timestamp(unix_timestamp(col("new_date"))
  .cast(TimestampType), "UTC").cast(StringType)).drop($"new_date")

As you can see, I have to change the original column name to new_date_.

But I want the column name to remain the same. However, if I don't change the name, the column gets dropped, so renaming is just a not-too-pretty workaround.

How can I preserve the original column name when appending the column?

Upvotes: 0

Views: 75

Answers (1)

Emiliano Martinez

Reputation: 4133

As far as I know, you cannot create two columns with the same name in a DataFrame transformation. I rename the new column back to the old column's name, like this:

val theMostRelevantFinalDf = olderDF.withColumn("new_date_", to_utc_timestamp(unix_timestamp(col("new_date"))
  .cast(TimestampType), "UTC").cast(StringType)).drop($"new_date").withColumnRenamed("new_date_", "new_date")
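A possible simplification, offered as a sketch rather than a definitive fix: Spark's Dataset API documents withColumn as adding a column or replacing an existing column that has the same name. Under that behavior, passing "new_date" itself as the target name should overwrite the column in place, and it is the trailing drop($"new_date") that removes the result when the name is left unchanged. The object name, the helper normalizeNewDate, and the sample row below are hypothetical, and the local SparkSession exists only to make the example self-contained.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, to_utc_timestamp, unix_timestamp}
import org.apache.spark.sql.types.{StringType, TimestampType}

object ReplaceColumnInPlace {

  // Hypothetical helper: same expression as in the question, but written back
  // under the existing name. withColumn replaces a same-named column, so no
  // temporary name, drop, or rename is involved.
  def normalizeNewDate(df: DataFrame): DataFrame =
    df.withColumn(
      "new_date",
      to_utc_timestamp(unix_timestamp(col("new_date")).cast(TimestampType), "UTC")
        .cast(StringType)
    )

  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumption, not from the question).
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("replace-column-in-place")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample row standing in for olderDF.
    val olderDF = Seq((1, "2020-01-01 12:00:00")).toDF("id", "new_date")

    val theMostRelevantFinalDf = normalizeNewDate(olderDF)

    // The schema should still list a single new_date column, now a string.
    theMostRelevantFinalDf.printSchema()

    spark.stop()
  }
}
```

The only difference from the question's code is the target name and the missing drop call. If a temporary name is preferred for some other reason, the withColumnRenamed approach above remains valid.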

Upvotes: 1
