Reputation: 1
I have a list of DataFrame column names that I need to concatenate (list_name = ["name", "email"]), and a DataFrame with many columns (df = "name", "email", "address", "phone"). I need to create a new column that holds the concatenated values of the columns named in the list.
Expected result: df = "name", "email", "address", "phone", "nameemail"
List = ["name", "email"], but the list is dynamic (it may contain any number of column names).
df
name | email | phone |
---|---|---|
ram | [email protected] | 345897045 |
raj | [email protected] | 658086657 |
expecteddf
name | email | phone | nameemail |
---|---|---|---|
ram | [email protected] | 345897045 | [email protected] |
raj | [email protected] | 658086657 | [email protected] |
Upvotes: 0
Views: 221
Reputation: 14277
This should be straightforward using the concat function. You should at least try something and show what you did, but this should be the way to go:
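from pyspark.sql.functions import concat

# List of column names to concatenate; it can hold any number of names
concat_columns = ["name", "email"]
# Name of the new column, e.g. "nameemail"
new_column_name = "".join(concat_columns)
# concat accepts column names as strings; withColumn appends the new column
expected_df = df.withColumn(new_column_name, concat(*concat_columns))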
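One thing to watch for: concat returns null for a row as soon as any of its inputs is null. If that is not what you want, a variant built on concat_ws might look like the sketch below. The helper names concat_column_name and add_concat_column are made up for illustration, the empty default separator is an assumption, and the cast to string is only there so non-string columns such as phone can be included.

```python
def concat_column_name(columns):
    """Build the new column's name by joining the source names, e.g. 'nameemail'."""
    return "".join(columns)


def add_concat_column(df, columns, sep=""):
    """Return df with one extra column holding the joined values of `columns`.

    concat_ws skips null inputs, whereas concat would make the whole
    result null for that row.
    """
    if not columns:
        raise ValueError("columns must contain at least one column name")

    # Imported here so the name helper above can be used without Spark installed.
    from pyspark.sql.functions import col, concat_ws

    # Cast every input to string so numeric columns can be joined as well.
    parts = [col(c).cast("string") for c in columns]
    return df.withColumn(concat_column_name(columns), concat_ws(sep, *parts))
```

With the question's data this would be called as add_concat_column(df, ["name", "email"]), and the list can be any length.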
Upvotes: 1