Count unique values in a table with group by, with and without where condition

I have a table:

FUND DATE ID POST
ACAT Friday, January 1, 2021 10058 5056
ACAT Friday, January 1, 2021 10058 5056
BCAT Friday, January 1, 2021 32598 5004
ACAT Monday, February 1, 2021 10058 5056
MISS Monday, February 1, 2021 10058 5056
CCAT Monday, February 1, 2021 32598 5004
DCAT Monday, March 1, 2021 10058 5056
ACAT Monday, March 1, 2021 10058 5056
MISS Monday, March 1, 2021 32598 5004
MISS Monday, March 1, 2021 56678 7845
ACAT Monday, March 1, 2021 45459 5056

I need a result set in the following format:

DATE COUNT_UNIQUE_ID_MISS COUNT_UNIQUE_POST_MISS COUNT_UNIQUE_ID_ALL COUNT_UNIQUE_POST_ALL
Friday, January 1, 2021 0 0 2 2
Monday, February 1, 2021 1 1 2 2
Monday, March 1, 2021 2 2 4 3

What I have:

SELECT [DATE]
,COUNT(DISTINCT ID) AS [CUIM]
,COUNT(DISTINCT POST) AS [CUPM]
FROM [TABLE]
WHERE [FUND] = 'MISS'
GROUP BY [DATE]

What I don't have: The logic to get a COUNT DISTINCT for ALL the rows (without WHERE)

What I have tried:

SELECT [DATE]
,COUNT(DISTINCT ID) AS [CUIM]
,COUNT(DISTINCT POST) AS [CUPM]
,(SELECT COUNT(DISTINCT ID) AS X FROM [TABLE]) AS [CUIA]
,(SELECT COUNT(DISTINCT ID) AS X FROM [TABLE]) AS [CUPA]
FROM [TABLE]
WHERE [FUND] = 'MISS'
GROUP BY [DATE]

The problem with this: I get a count unique for the whole table, but its not grouped by date. I cannot put GROUP BY in the subqueries, as it would return multiple values -> error message.

Dummy data to work on:

CREATE TABLE [TABLE] (
    [FUND] varchar(255),
    [DATE] varchar(255),
    [ID] int,
    [POST] int,
);

INSERT INTO [TABLE]
VALUES ('ACAT', 'Friday, January 1, 2021', 10058, 5056),
('ACAT', 'Friday, January 1, 2021', 10058, 5056),
('BCAT', 'Friday, January 1, 2021', 32598, 5004),
('ACAT', 'Monday, February 1, 2021', 10058, 5056),
('MISS', 'Monday, February 1, 2021', 10058, 5056),
('CCAT', 'Monday, February 1, 2021', 32598, 5004),
('DCAT', 'Monday, March 1, 2021', 10058, 5056),
('ACAT', 'Monday, March 1, 2021', 10058, 5056),
('MISS', 'Monday, March 1, 2021', 32598, 5004),
('MISS', 'Monday, March 1, 2021', 56678, 7845),
('ACAT', 'Monday, March 1, 2021', 45459, 5056);

SELECT * FROM [TABLE]

Thank you.

Upvotes: 0

Views: 414

Answers (2)

Stu
Stu

Reputation: 32579

It's not as compact but If you wanted to keep the where clause filtering you can also get the correlation you need from your existing counts by refactoring using outer apply

SELECT
    DATE,
    COUNT(DISTINCT ID) AS CUIM,
    COUNT(DISTINCT POST) AS CUPM,
    MAX(C1.CUIA) AS CUIA,
    MAX(C2.CUPA) AS CUPA
FROM TABLE T
OUTER APPLY (SELECT COUNT(DISTINCT ID) CUIA FROM TABLE T2 WHERE T2.DATE = T.DATE) C1
OUTER APPLY (SELECT COUNT(DISTINCT POST) CUPA FROM TABLE T2 WHERE T2.DATE = T.DATE) C2
WHERE FUND = 'MISS'
GROUP BY DATE;

Upvotes: 0

astentx
astentx

Reputation: 6751

You need conditional aggregation:

SELECT
  [DATE]
  , COUNT(DISTINCT case when [FUND] = 'MISS' then ID end) AS COUNT_UNIQUE_ID_MISS
  , COUNT(DISTINCT case when [FUND] = 'MISS' then POST end) AS COUNT_UNIQUE_POST_MISS
  , COUNT(DISTINCT ID) as COUNT_UNIQUE_ID_ALL
  , COUNT(DISTINCT POST) as COUNT_UNIQUE_POST_ALL
FROM [TABLE]
GROUP BY [DATE]

Upvotes: 1

Related Questions