Reputation: 3827
I have a csv file that looks like the following
name, state, a, b, c, d, ..., x
Jon, NY, 1, 4, 6, 2, 6
Eric, CA, 5, 3, 1, 5, 6
Chris,LA, 4, 4, 3, 1, 5
and I want the following result (one column with concat fields (inkl header name))
concate-fields
"name=Jon, state=NY, a=1, b=4, c=6, d= 2, ... x=6"
"name=Eric, state=CA, a=5, b=3, c=1, d= 5, ... x=6"
"name=Chris, state=LA, a=4, b=4, c=3, d= 1, ... x=5"
There can be many headers from a...>x so these should be appended in a generic way
I now have
import org.apache.spark.sql.functions.{concat, lit}
val lp = sample.select(concat(lit("name), $"name", lit(",state="), $"state")
display(lp)
But I have trouble adding the same for column a->x (as this needs to be done in a generic way)
Upvotes: 2
Views: 176
Reputation: 5487
You can dynamically create SQL expression to concat columns by calling map
method on df.columns()
as shown below.
val df = // Read CSV
df.withColumn("concate-fields", expr(s"concat(${df.columns.map(col=>s"'$col=', nvl($col,'null'),','").mkString("").dropRight(4)})"))
.withColumn("concate-fields", concat(lit("\""),col("concate-fields"),lit("\"")))
Upvotes: 1