Uday Sagar

Reputation: 480

Implementing SQL logic via Dataframes using Spark and Scala

I have three columns (c1, c2, c3) in a Hive table t1, and MySQL code that checks whether specific columns are null. I have a dataframe, df, loaded from the same table with the same three columns c1, c2, c3, and I would like to implement the same logic on it.

Here is the SQL:

if(
t1.c1=0 Or IsNull(t1.c1),
if(
IsNull(t1.c2/t1.c3),
1,
t1.c2/t1.c3
),
t1.c1
) AS myalias

I drafted the following logic in Scala using "when" as an alternative to SQL's "if", but I am having trouble writing the "Or" condition. How can I write the above SQL logic with a Spark dataframe in Scala?

val df_withalias = df.withColumn("myalias",when(
  Or((df("c1") == 0), isnull(df("c1"))),
  when(
    (isNull((df("c2") == 0)/df("c3")),
  )
)
)

Upvotes: 0

Views: 373

Answers (1)

Tzach Zohar

Reputation: 37852

First, you can use Column's || operator to construct logical OR conditions. Also note that when takes only two arguments (a condition and a value); if you want to supply an alternative value (to be used when the condition isn't met), you need .otherwise:

val df_withalias = df.withColumn("myalias",
  when(df("c1") === 0 || isnull(df("c1")),                          // c1 is 0 or null
    when(isnull(df("c2")/df("c3")), 1).otherwise(df("c2")/df("c3")) // use c2/c3, falling back to 1 if it is null
  ).otherwise(df("c1"))                                             // otherwise keep c1
)
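
For a quick sanity check, here is a minimal, self-contained sketch. The SparkSession setup and the sample values are made up for illustration; the rows are chosen to exercise each branch of the logic:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{when, isnull}

val spark = SparkSession.builder().master("local[*]").appName("myalias-demo").getOrCreate()
import spark.implicits._

// Hypothetical rows covering each branch:
//  (0.0,  4.0, 2.0)   -> c1 is 0, c2/c3 = 2.0 is not null => myalias = 2.0
//  (null, 4.0, null)  -> c1 is null, c2/c3 is null        => myalias = 1.0
//  (5.0,  4.0, 2.0)   -> c1 is present and non-zero       => myalias = 5.0
val df = Seq[(Option[Double], Option[Double], Option[Double])](
  (Some(0.0), Some(4.0), Some(2.0)),
  (None,      Some(4.0), None),
  (Some(5.0), Some(4.0), Some(2.0))
).toDF("c1", "c2", "c3")

val df_withalias = df.withColumn("myalias",
  when(df("c1") === 0 || isnull(df("c1")),
    when(isnull(df("c2")/df("c3")), 1).otherwise(df("c2")/df("c3"))
  ).otherwise(df("c1"))
)

df_withalias.show()  // expect myalias = 2.0, 1.0, 5.0

Nesting when inside when mirrors the nested SQL if exactly; the inner when(...).otherwise(...) reproduces the null-check on c2/c3.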

Upvotes: 1
