Reputation: 89
I am using spark scala with DataFrame API, trying to convert the below sql logic
CASE
WHEN col_1 like '%XYZ' OR col_1 like '%ZYX' THEN
CASE WHEN col_2 like '%TTT' THEN 'ABC' ELSE 'BBA' END
WHEN col_1 not like '%XYZ' OR col_1 not like '%ZYX'
CASE WHEN col_2 like '%YYY' THEN BBC' END
END as new_col
How to construct CASE WHEN with multiple like and not like conditions with spark scala dataframe api?
Upvotes: 0
Views: 128
Reputation: 131
Use the expr function and pass the whole case statement in it as below.
import org.apache.spark.sql.functions._
val df=Seq(
("A","01/01/2022",1), ("AXYZ","02/01/2022",1), ("AZYX","03/01/2022",1),("AXYZ","04/01/2022",0), ("AZYX","05/01/2022",0),("AB","06/01/2022",1), ("A","07/01/2022",0) ).toDF("Category", "date", "Indictor")
df.select(col("*"),expr("""CASE WHEN Category like '%XYZ' OR Category like '%ZYX' THEN
CASE WHEN Indictor = 1 THEN 'ABC' ELSE 'BBA' END
WHEN Category not like '%XYZ' OR Category not like '%ZYX' then
CASE WHEN Indictor = 1 THEN 'BBC' ELSE 'BBD' END
END""").alias("new_col")).show()
+--------+----------+--------+-------+
|Category| date|Indictor|new_col|
+--------+----------+--------+-------+
| A|01/01/2022| 1| BBC|
| AXYZ|02/01/2022| 1| ABC|
| AZYX|03/01/2022| 1| ABC|
| AXYZ|04/01/2022| 0| BBA|
| AZYX|05/01/2022| 0| BBA|
| AB|06/01/2022| 1| BBC|
| A|07/01/2022| 0| BBD|
+--------+----------+--------+-------+
Upvotes: 1