Puneeth Kumar

Reputation: 171

Adding a constant value to columns in a Spark dataframe

I have a Spark data frame which will be like below

id person age
1  naveen 24

I want to add a constant "del" to each column value except the last column in the dataframe, like below:

id   person   age
1del naveendel 24

Can someone show me how to implement this on a Spark DataFrame using Scala?
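
For reference, a minimal sketch of how the sample frame above could be built (assuming a SparkSession named spark):

import spark.implicits._

val df = Seq((1, "naveen", 24)).toDF("id", "person", "age")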

Upvotes: 2

Views: 9729

Answers (1)

Tzach Zohar

Reputation: 37852

You can use the lit and concat functions:

import org.apache.spark.sql.functions._

// add the suffix to all but the last column (works for any number of columns):
val colsWithSuffix = df.columns.dropRight(1).map(c => concat(col(c), lit("del")) as c)
val result = df.select(colsWithSuffix :+ col(df.columns.last): _*)

result.show()
// +----+---------+---+
// |id  |person   |age|
// +----+---------+---+
// |1del|naveendel|24 |
// +----+---------+---+
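
The same transformation can also be written as a foldLeft over withColumn, replacing each column in place; a quick sketch of that equivalent approach:

// rewrite every column except the last one, keeping the original column order:
val result2 = df.columns.dropRight(1).foldLeft(df) { (acc, c) =>
  acc.withColumn(c, concat(col(c), lit("del")))
}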

EDIT: to also accommodate null values, you can wrap each column in coalesce before appending the suffix; replace the line calculating colsWithSuffix with:

val colsWithSuffix = df.columns.dropRight(1)
  .map(c => concat(coalesce(col(c), lit("")), lit("del")) as c)
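
For example, with a hypothetical row whose person column is null (again assuming a SparkSession named spark for toDF):

import spark.implicits._

val dfWithNull = Seq((2, null.asInstanceOf[String], 30)).toDF("id", "person", "age")
dfWithNull.select(
  dfWithNull.columns.dropRight(1)
    .map(c => concat(coalesce(col(c), lit("")), lit("del")) as c) :+ col("age"): _*
).show()
// the null person collapses to just the suffix: 2del | del | 30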

Upvotes: 7
