Reputation: 4243
I have a dataset like below:
Col_A Col_B Col_C Col_D Metric_1
10/1/19 A NULL Z 5
10/1/19 A AB Z 10
10/1/19 A Z 12
10/1/19 B AC J 7
10/1/19 B AB J 12
10/1/19 B J 13
How do I get the count of Col_C where it is not NULL or Blank ''? My dataset contains user level data and I want to pretty much get the count where NULL or Blank is not included.
Col_A Col_B Col_C Col_D Metric_1 COUNT_COL_C
10/1/19 A NULL Z 5 1
10/1/19 A AB Z 10 1
10/1/19 A Z 12 1
10/1/19 B AC J 7 2
10/1/19 B AB J 12 2
10/1/19 B J 13 2
My attempt looks like this, but it is saying the result is 3 for both instead of 1 and 2:
Dataset_Result AS (SELECT *,
CASE WHEN COL_C IS NOT NULL AND COL_C != '' THEN COUNT(*) OVER (PARTITION BY COL_A,COL_B, COL_D) END
COUNT_COL_C
FROM ORIGINAL_DATASET)
SELECT * FROM Dataset_Result;
Upvotes: 0
Views: 40
Reputation: 44871
You can move the conditional inside the window function:
;WITH Dataset_Result AS (
SELECT *,
SUM(CASE WHEN COL_C IS NOT NULL AND COL_C != '' THEN 1 ELSE 0 END) OVER (PARTITION BY COL_A,COL_B, COL_D)
COUNT_COL_C
FROM ORIGINAL_DATASET)
SELECT * FROM Dataset_Result;
See http://www.sqlfiddle.com/#!18/cb8e1/2 for an example.
Upvotes: 1
Reputation: 463
Just doing a normal count with a group by and no CTE or window functions should work for what you say you need.
SELECT Col_A, Col_B, Col_C,Col_D, Metric_1, COUNT(*)
FROM ORIGINAL_DATASET
where COL_C IS NOT NULL AND COL_C != ''
group by Col_A, Col_B, Col_C,Col_D, Metric_1
UNION
SELECT Col_A, Col_B, Col_C,Col_D, Metric_1, COUNT(*)
FROM ORIGINAL_DATASET
where COL_C IS NULL AND COL_C = ''
group by Col_A, Col_B, Col_C,Col_D, Metric_1
EDIT1: Better approach
SELECT
COUNT(CASE WHEN COL_C IS NOT NULL AND COL_C != '' THEN 1 ELSE NULL END) AS COUNT_COL_C
FROM ORIGINAL_DATASET
Upvotes: 1