Robert Niestroj
Robert Niestroj

Reputation: 16151

SQL find rows in groups where a column has a null and a non-null value

The Data

   row    ID YEAR  PROD STA                    DATE
    01    01 2011 APPLE NEW 2011-11-18 00:00:00.000
    02    01 2011 APPLE NEW 2011-11-18 00:00:00.000
    03    01 2013 APPLE OLD NULL
    04    01 2013 APPLE OLD NULL
    05    02 2013 APPLE OLD 2014-04-08 00:00:00.000
    06    02 2013 APPLE OLD 2014-04-08 00:00:00.000
    07    02 2013 APPLE OLD 2014-11-17 10:50:14.113
    08    02 2013 APPLE OLD 2014-11-17 10:46:04.947
    09    02 2013 MELON OLD 2014-11-17 11:01:19.657
    10    02 2013 MELON OLD 2014-11-17 11:19:35.547
    11    02 2013 MELON OLD NULL
    12    02 2013 MELON OLD 2014-11-21 10:32:36.017
    13    03 2006 APPLE NEW 2007-04-11 00:00:00.000
    14    03 2006 APPLE NEW 2007-04-11 00:00:00.000
    15    04 2004 APPLE OTH 2004-09-27 00:00:00.000
    16    04 2004 APPLE OTH NULL

ROW is not a column in the table. Is just to show which records i want.

The question

I need to find rows where a group consisting of (ID, YEAR, PROD, STA) has at least one NULL DATE and a non-NULL DATE.

Expected result

From the above dataset this would be rows 9 to 12 and 15 to 16

Im sitting in front od SSMS and have no idea how to get this. Thinking about group by and exists but really no idea.

Upvotes: 2

Views: 1979

Answers (1)

Giorgos Betsos
Giorgos Betsos

Reputation: 72225

You can use COUNT ... OVER:

SELECT ID, YEAR, PROD, STA, [DATE]
FROM (
  SELECT ID, YEAR, PROD, STA, [DATE],
         COUNT(IIF([DATE] IS NULL, 1, NULL)) OVER 
              (PARTITION BY ID, YEAR, PROD, STA) AS cnt_nulls,
         COUNT(IIF([DATE] IS NOT NULL, 1, NULL)) OVER
               (PARTITION BY ID, YEAR, PROD, STA) AS cnt_not_nulls
  FROM mytable) AS t
WHERE t.cnt_nulls > 0 AND t.cnt_not_nulls > 0

The window version of COUNT is applied twice over ID, YEAR, PROD, STA partitions of data: it returns for every row the population of the current partition. The count is conditionally performed:

  • the first COUNT counts the number of NULL [Date] values within the partition
  • the second COUNT counts the number of NOT NULL [Date] values within the partition.

The outer query checks for partitions having a count of at least one for both of the two COUNT functions of the inner query.

Upvotes: 4

Related Questions