Johann Schatzer
Johann Schatzer

Reputation: 11

Count groups of NULL values - partition or window?

  n | g
 ---------
  1 | 1
  2 | NULL
  3 | 1
  4 | 1
  5 | 1
  6 | 1
  7 | NULL
  8 | NULL
  9 | NULL
 10 | 1
 11 | 1
 12 | 1
 13 | 1
 14 | 1
 15 | 1
 16 | 1
 17 | NULL
 18 | 1
 19 | 1
 20 | 1
 21 | NULL
 22 | 1
 23 | 1
 24 | 1
 25 | 1
 26 | NULL
 27 | NULL
 28 | 1
 29 | 1
 30 | NULL
 31 | 1

From the above column g I should get this result:

 x|y
 ---
 1|4
 2|1
 3|1

where

x stands for the count of contiguous NULLs and
y stands for the times a single group of NULLs occurs.

I.e., there is ...
4 groups of only 1 NULL,
1 group of 2 NULLs and
1 group of 3 NULLs

Upvotes: 0

Views: 1522

Answers (2)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 656331

Compute a running count of not-null values with a window function to form groups, then 2 two nested counts ...

SELECT x, count(*) AS y
FROM  (
   SELECT grp, count(*) FILTER (WHERE g IS NULL) AS x
   FROM  (
      SELECT g, count(g) OVER (ORDER BY n) AS grp
      FROM   tbl
      ) sub1
   WHERE  g IS NULL
   GROUP  BY grp
   ) sub2
GROUP  BY 1
ORDER  BY 1;

count() only counts not null values.

This includes the preceding row with a not null g in the following group (grp) of NULL values - which has to be removed from the count.

I replaced the HAVING clause I had for that in my initial query with WHERE g IS NULL, like @klin uses in his answer), that's simpler.

Related:

If n is a gapless sequence of integer numbers, you can simplify further:

SELECT x, count(*) AS y
FROM  (
   SELECT grp, count(*) AS x
   FROM  (
      SELECT n - row_number() OVER (ORDER BY n) AS grp
      FROM   tbl
      WHERE  g IS NULL
      ) sub1
   GROUP  BY 1
   ) sub2
GROUP  BY 1
ORDER  BY 1;

Eliminate not null values immediately and deduct the row number from n, thereby arriving at (meaningless) group numbers directly ...


While the only possible value in g is 1, sum() is a smart trick (like @klin provided). But that should be a boolean column then, wouldn't make sense as numeric type. So I assume that's just a simplification of the actual problem in the question.

Upvotes: 1

klin
klin

Reputation: 121524

select x, count(x) y
from (
    select s, count(s) x
    from ( 
        select *, sum(g) over (order by i) as s
        from example
        ) s
    where g isnull
    group by 1
    ) s
group by 1
order by 1;

Test it here.

Upvotes: 0

Related Questions