Reputation: 31
I have to concatenate multiple columns with a condition in spark Scala, but it doesn't work with the "if". I have following DataFrame:
table:
+---+----+----+
| a| b| c|
+---+----+----+
| 0| 1| 1|
| 1| 0| 0|
| 1| 1| 0|
+---+----+----+
table.withColumn("concat", concat_ws(", ", (if($"a"===1){lit("D")} else{null}),
(if($"b"===1){lit("E")} else{null}),
(if($"c"===1){lit("F")} else{null})))
Below is the final required result.
+---+----+----+------+
| a| b| c|concat|
+---+----+----+------+
| 0| 1| 1| E, F|
| 1| 0| 0| D|
| 1| 1| 0| D, E|
+---+----+----+------+
I wouldn't create other columns like this:
val ftable = (table.withColumn("D", when ($"a"===1, lit("D")))
.withColumn("E", when ($"b"===1, lit("E")))
.withColumn("F", when ($"c"===1, lit("F"))))
val columnselection = ftable.select($"D", $"E" , $"F" )
val selection = columnselection.columns.map(col)
val animaliCol = ftable.select(ftable.col("*"), concat_ws(", ", selection : _*).as("concat"))
Upvotes: 1
Views: 1450
Reputation: 10382
Used concat_ws
, when
& otherwise
functions.
scala> df.show(false)
+---+---+---+
|a |b |c |
+---+---+---+
|0 |1 |1 |
|1 |0 |0 |
|1 |1 |0 |
+---+---+---+
Creating required columns & its default values.
scala> val expressions = concat_ws(", ",Seq(("a","D"),("b","E"),("c","F")).map(c => when(col(c._1) === 1, lit(c._2)).otherwise(null)):_*) // Create column names & its default values expressions.
expressions: org.apache.spark.sql.Column = concat_ws(, , CASE WHEN (a = 1) THEN D ELSE NULL END, CASE WHEN (b = 1) THEN E ELSE NULL END, CASE WHEN (c = 1) THEN F ELSE NULL END)
Final output
scala> df.withColumn("concat",expressions).show(false)
+---+---+---+------+
|a |b |c |concat|
+---+---+---+------+
|0 |1 |1 |E, F |
|1 |0 |0 |D |
|1 |1 |0 |D, E |
+---+---+---+------+
Upvotes: 0
Reputation: 2451
You should replace if
with when
and otherwise
.
import org.apache.spark.sql.functions._
table.withColumn("concat", concat_ws(", ",
when($"a"===1,lit("D")).otherwise(null),
when($"b"===1,lit("E")).otherwise(null),
when($"c"===1,lit("F")).otherwise(null)
)).show()
output:
+---+---+---+------+
| a| b| c|concat|
+---+---+---+------+
| 0| 1| 1| E, F|
| 1| 0| 0| D|
| 1| 1| 0| D, E|
+---+---+---+------+
Upvotes: 1