Reputation: 171
I have a Spark DataFrame that looks like this:
id person age
1 naveen 24
I want to append the constant "del" to every column value except those in the last column, like this:
id person age
1del naveendel 24
Can someone help me implement this on a Spark DataFrame using Scala?
Upvotes: 2
Views: 9729
Reputation: 37852
You can use the lit and concat functions:
import org.apache.spark.sql.functions._
// add suffix to all but last column (would work for any number of cols):
val colsWithSuffix = df.columns.dropRight(1).map(c => concat(col(c), lit("del")) as c)
// keep the last column unchanged, whatever its name is
val result = df.select(colsWithSuffix :+ col(df.columns.last): _*)
result.show()
// +----+---------+---+
// |id  |person   |age|
// +----+---------+---+
// |1del|naveendel|24 |
// +----+---------+---+
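To try this end to end, here is a minimal self-contained sketch. It assumes a local SparkSession and rebuilds the single sample row from the question; the names SuffixDemo and addSuffix are made up for illustration and are not part of Spark.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, concat, lit}

object SuffixDemo {
  // Hypothetical helper: appends `suffix` to every column except the last one.
  // Assumes the DataFrame has at least one column.
  def addSuffix(df: DataFrame, suffix: String): DataFrame = {
    val suffixed = df.columns.dropRight(1).map(c => concat(col(c), lit(suffix)).as(c))
    // The last column is selected as-is, so its name does not need to be known.
    df.select(suffixed :+ col(df.columns.last): _*)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local single-threaded session is enough for this demo.
    val spark = SparkSession.builder().master("local[1]").appName("suffix-demo").getOrCreate()
    import spark.implicits._

    val df = Seq((1, "naveen", 24)).toDF("id", "person", "age")
    addSuffix(df, "del").show(false)

    spark.stop()
  }
}
```

Note that concat casts the integer id column to a string, so id becomes a string column in the result.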
EDIT: to also handle null values, wrap each column in coalesce before appending the suffix, because concat returns null when any of its inputs is null. Replace the line that computes colsWithSuffix with:
val colsWithSuffix = df.columns.dropRight(1)
  .map(c => concat(coalesce(col(c), lit("")), lit("del")) as c)
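The difference is easiest to see on a row that actually contains a null. The sketch below is only an illustration: it assumes a local SparkSession, invents a second row with a null person, and uses the made-up names NullSuffixDemo, plainSuffix and nullSafeSuffix.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{coalesce, col, concat, lit}

object NullSuffixDemo {
  // Plain concat: a null input makes the whole result null.
  def plainSuffix(df: DataFrame, c: String): DataFrame =
    df.select(concat(col(c), lit("del")).as(c))

  // coalesce replaces null with an empty string first, so the suffix survives.
  def nullSafeSuffix(df: DataFrame, c: String): DataFrame =
    df.select(concat(coalesce(col(c), lit("")), lit("del")).as(c))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("null-suffix-demo").getOrCreate()
    import spark.implicits._

    // Option[String] yields a nullable column; the second row (invented) has no person.
    val df = Seq[(Int, Option[String], Int)]((1, Some("naveen"), 24), (2, None, 30))
      .toDF("id", "person", "age")

    plainSuffix(df, "person").show(false)    // second row stays null
    nullSafeSuffix(df, "person").show(false) // second row becomes "del"

    spark.stop()
  }
}
```

Whether a null should turn into a bare "del" or stay null depends on the use case; drop the coalesce if nulls should be preserved.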
Upvotes: 7