Reputation: 1106
I have the following data frame in Spark:
 s   s_type  o   o_type
-----------------------
 s1  ss1     o1  oo1
 s2  ss2     o2  oo2
I want to swap the columns to get:
 s   s_type  o   o_type
-----------------------
 o1  oo1     s1  ss1
 o2  oo2     s2  ss2
One way is to copy the columns [o, o_type] into temporary columns ['o_temp', 'o_type_temp'], then copy the values of [s, s_type] into [o, o_type], and finally copy ['o_temp', 'o_type_temp'] back into [s, s_type].
I was wondering if there is a better/more efficient way to do this?
Upvotes: 2
Views: 3515
Reputation: 43534
You can just use select with pyspark.sql.Column.alias:
from pyspark.sql.functions import col

df = df.select(
    col("o").alias("s"),
    col("o_type").alias("s_type"),
    col("s").alias("o"),
    col("s_type").alias("o_type")
)
For a more generalized solution, you can create a mapping of old name to new name and loop over this in a list comprehension:
# key = old column, value = new column
mapping = {
    "o": "s",
    "o_type": "s_type",
    "s": "o",
    "s_type": "o_type"
}

df = df.select(*[col(old).alias(new) for old, new in mapping.items()])
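To see what the old-name-to-new-name mapping does, here is the same idea sketched on plain Python rows (no Spark session needed); the select above performs the equivalent operation at the DataFrame level:

```python
# Hypothetical sample rows mirroring the question's data frame.
mapping = {
    "o": "s",
    "o_type": "s_type",
    "s": "o",
    "s_type": "o_type"
}

rows = [
    {"s": "s1", "s_type": "ss1", "o": "o1", "o_type": "oo1"},
    {"s": "s2", "s_type": "ss2", "o": "o2", "o_type": "oo2"},
]

# For each row, read the value under the old name and store it
# under the new name -- the dict analogue of col(old).alias(new).
swapped = [{new: row[old] for old, new in mapping.items()} for row in rows]

print(swapped[0])
# {'s': 'o1', 's_type': 'oo1', 'o': 's1', 'o_type': 'ss1'}
```

Because select builds the output columns in the order you list them, the looped version also keeps the original column order as long as the mapping's values are listed in that order.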
Upvotes: 4