Ramesh

Reputation: 1593

Convert each row of a Spark DataFrame to a String with a delimiter between column values in Scala

I want to convert each row of a Spark DataFrame to a String, with a delimiter between the column values.

For example: I have an input DataFrame 'df' with 3 columns, "firstname", "lastname", and "age", containing the two records below.

Row1: John Abhraham 21
Row2: Steve Austin 22

I want to create a new DataFrame with just one column, whose data looks like below.

Row1: John$Abhraham$21
Row2: Steve$Austin$22

Can anyone please help me do this?

Upvotes: 2

Views: 11829

Answers (2)

Alec

Reputation: 32309

I don't have a Spark shell handy, but I think this one-liner should do it:

def stringifyRows(df: DataFrame, sep: String): DataFrame = {
  // the implicits provide the Encoder needed by .map and the .toDF syntax
  import df.sparkSession.implicits._
  df.map(row => row.mkString(sep)).toDF("myColumnName")
}

For your example, you would call this as stringifyRows(myDf, "$"). Let me know what the error message is if this doesn't work.
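For completeness, a sketch of how the call would look end to end, using the sample data from the question (this assumes a running SparkSession named spark; I haven't run it against a cluster):

```scala
import spark.implicits._

// Build the question's example DataFrame and stringify its rows with "$"
val df = Seq(("John", "Abhraham", 21), ("Steve", "Austin", 22))
  .toDF("firstname", "lastname", "age")

stringifyRows(df, "$").show(false)
```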

Upvotes: 5

Shankar

Reputation: 8957

You can use concat (from org.apache.spark.sql.functions) for this; note that every argument must be a Column, so "age" needs the $ prefix too.

For example:

df.select(concat($"firstname", lit("$"), $"lastname", lit("$"), $"age")).show()

OR

df.withColumn("newColumnName", concat($"firstname", lit("$"), $"lastname", lit("$"), $"age")).show()

Upvotes: 1
