Ana

Reputation: 103

SQL/Spark: get condition with group and null values

I need to group by field1 and field2 and return a single VALIDTO value per group: null if the group contains any null VALIDTO values, otherwise the maximum VALIDTO. This is tricky because MAX, MIN and COUNT all ignore null values, so an attempt like the following does not behave as intended:

test = sql_context.sql("""select field1, field2,
                                 if(MIN(VALIDTO) is null, null, MAX(VALIDTO)) as VALIDTO
                          from test_table
                          group by field1, field2
                       """)

Many thanks in advance!

Upvotes: 0

Views: 578

Answers (2)

mck

Reputation: 42332

You can check whether a group contains any nulls by taking the maximum of the boolean expression VALIDTO is null within that group:

select 
    field1, field2, 
    case when max(VALIDTO is null) = False then max(VALIDTO) end as VALIDTO
from test_table
group by field1, field2
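A quick way to sanity-check this logic is to run the same query against an in-memory database. The sketch below uses Python's sqlite3 as a stand-in for the Spark session (the table name and sample dates are made up for illustration); note that SQLite encodes the boolean VALIDTO is null as 1/0, so the comparison is = 0 where Spark SQL would compare to false:

```python
import sqlite3

# Sanity check of the max(VALIDTO is null) trick using sqlite3.
# In Spark SQL the inner expression is a boolean (compare to false);
# SQLite yields 1/0, so we compare to 0 here.
con = sqlite3.connect(":memory:")
con.execute("create table test_table (field1 text, field2 text, VALIDTO text)")
con.executemany(
    "insert into test_table values (?, ?, ?)",
    [
        ("a", "x", "2021-01-01"),
        ("a", "x", None),          # group (a, x) has a null -> result is null
        ("b", "y", "2021-02-01"),
        ("b", "y", "2021-03-01"),  # group (b, y) has no nulls -> max kept
    ],
)
rows = con.execute("""
    select field1, field2,
           case when max(VALIDTO is null) = 0 then max(VALIDTO) end as VALIDTO
    from test_table
    group by field1, field2
    order by field1
""").fetchall()
print(rows)  # [('a', 'x', None), ('b', 'y', '2021-03-01')]
```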

Upvotes: 2

Gordon Linoff

Reputation: 1269443

Hmmm . . . I think you want an aggregation function on validto that returns NULL if any values are NULL. If so, you can use a case expression:

(case when count(*) = count(validTo) then max(validTo)
 end)
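The comparison works because count(*) counts every row while count(validTo) skips nulls, so the two counts differ exactly when the group contains a null. A minimal check of the full query, sketched with sqlite3 and made-up sample data (the same SQL runs unchanged in Spark SQL):

```python
import sqlite3

# count(*) counts all rows; count(validTo) skips nulls.
# They are equal only when the group has no null validTo values.
con = sqlite3.connect(":memory:")
con.execute("create table test_table (field1 text, field2 text, validTo text)")
con.executemany(
    "insert into test_table values (?, ?, ?)",
    [("a", "x", "2021-01-01"), ("a", "x", None),
     ("b", "y", "2021-02-01"), ("b", "y", "2021-03-01")],
)
rows = con.execute("""
    select field1, field2,
           (case when count(*) = count(validTo) then max(validTo)
            end) as validTo
    from test_table
    group by field1, field2
    order by field1
""").fetchall()
print(rows)  # [('a', 'x', None), ('b', 'y', '2021-03-01')]
```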

Upvotes: 1
