Reputation:
I have the table below. Is it possible to do a cumulative distinct count? For example, if A1 has 3 distinct values, then its count will be 3. Next, consider A1 and A2 together: if they have 5 distinct values between them, the count for A2 is 5. Repeat until A1 + A2 + ... + An, counting the distinct values at each step.
A | V |
---|---|
A1 | V1 |
A1 | V2 |
A1 | V2 |
A2 | V1 |
A2 | V2 |
A2 | V3 |
My expected output would be:
A | C |
---|---|
A1 | 2 |
A2 | 3 |
Upvotes: 0
Views: 70
Reputation: 167981
You can use a partitioned outer join to ensure that all V values are counted for all A values, and then use the FIRST_VALUE analytic function to find whether a value exists in the current or preceding A values for the V:
SELECT a,
       COUNT( DISTINCT fv ) AS c
FROM   (
  SELECT t.a,
         FIRST_VALUE(t.v) IGNORE NULLS OVER (
           PARTITION BY v.v
           ORDER BY t.a
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
         ) AS fv
  FROM   ( SELECT DISTINCT v FROM table_name ) v
         LEFT OUTER JOIN table_name t
         PARTITION BY ( t.a )
         ON ( t.v = v.v )
)
GROUP BY a
ORDER BY a
Which, for the sample data:
CREATE TABLE table_name ( A, V ) AS
SELECT 'A1', 'V1' FROM DUAL UNION ALL
SELECT 'A1', 'V2' FROM DUAL UNION ALL
SELECT 'A1', 'V3' FROM DUAL UNION ALL
SELECT 'A2', 'V1' FROM DUAL UNION ALL
SELECT 'A2', 'V3' FROM DUAL UNION ALL
SELECT 'A2', 'V4' FROM DUAL UNION ALL
SELECT 'A3', 'V2' FROM DUAL UNION ALL
SELECT 'A3', 'V3' FROM DUAL UNION ALL
SELECT 'A4', 'V1' FROM DUAL UNION ALL
SELECT 'A4', 'V5' FROM DUAL;
Outputs:
A | C |
---|---|
A1 | 3 |
A2 | 4 |
A3 | 4 |
A4 | 5 |
db<>fiddle here
Upvotes: 0
Reputation: 164099
You can use the ROW_NUMBER() window function to find the first occurrence of each V, and then the COUNT() window function to count only these first occurrences:
SELECT DISTINCT A,
       COUNT(CASE WHEN rn = 1 THEN 1 END) OVER (ORDER BY A) C
FROM (
  SELECT A, ROW_NUMBER() OVER (PARTITION BY V ORDER BY A) rn
  FROM tablename
) t
ORDER BY A
See the demo.
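To see why counting only rn = 1 works, it can help to run the inner step alone. The sketch below inlines the question's six sample rows as a CTE named tablename (an assumption for illustration; Oracle syntax with DUAL) and shows the row number each row receives:

```sql
-- Sample rows from the question, inlined so the query is self-contained.
WITH tablename AS (
  SELECT 'A1' AS A, 'V1' AS V FROM DUAL UNION ALL
  SELECT 'A1', 'V2' FROM DUAL UNION ALL
  SELECT 'A1', 'V2' FROM DUAL UNION ALL
  SELECT 'A2', 'V1' FROM DUAL UNION ALL
  SELECT 'A2', 'V2' FROM DUAL UNION ALL
  SELECT 'A2', 'V3' FROM DUAL
)
-- Number the rows of each V in order of A; rn = 1 marks the first
-- A in which that V appears.
SELECT A, V, ROW_NUMBER() OVER (PARTITION BY V ORDER BY A) AS rn
FROM tablename
ORDER BY V, A;
```

Here V1 and V2 both get rn = 1 on an A1 row, and V3 gets rn = 1 on its A2 row, so the running count of rn = 1 rows is 2 at A1 and 3 at A2, which matches the expected output in the question.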
Upvotes: 0
Reputation: 1269873
This answers the original version of the question.
You can aggregate twice: once to keep the first occurrence of each v, and a second time to count those first occurrences per a:
select a, count(*) as new_cs
from (select v, min(a) as a
      from t
      group by v
     ) v
group by a;
Note: The above only shows the a values that have new values. If you want all a values, then window functions are a better approach:
select a, sum(case when seqnum = 1 then 1 else 0 end) as c
from (select t.*, row_number() over (partition by v order by a) as seqnum
      from t
     ) t
group by a
order by a;
Here is a db<>fiddle.
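The query above returns the number of new values per a rather than a running total. If the cumulative count from the current version of the question is wanted, one possible extension (a sketch, not part of the original answer) is to wrap the aggregate in a window sum ordered by a:

```sql
-- Same inner query as above; the outer SUM(...) OVER (ORDER BY a)
-- accumulates the per-a counts of first occurrences.
select a,
       sum(sum(case when seqnum = 1 then 1 else 0 end)) over (order by a) as c
from (select t.*, row_number() over (partition by v order by a) as seqnum
      from t
     ) t
group by a
order by a;
```

On the question's sample table this should give 2 for A1 and 3 for A2. The inner sum is evaluated per group first, and the window function then runs over the grouped rows.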
Upvotes: 2