Reputation: 115
I have a table with four columns, looking like this:
id,name, key, date
1,'A' ,'x1','2015-11-11'
2,'A' ,'x1','2015-11-11'
3,'B' ,'x2','2015-11-11'
4,'B' ,'x2','2015-11-11'
5,'A' ,'x1','2015-11-12'
6,'A' ,'x1','2015-11-12'
7,'B' ,'x2','2015-11-12'
8,'B' ,'x2','2015-11-12'
9,'D' ,'x3','2015-11-12'
10,'A' ,'x1','2015-12-11'
11,'A' ,'x1','2015-12-11'
12,'B' ,'x2','2015-12-11'
13,'B' ,'x2','2015-12-11'
14,'A' ,'x1','2015-12-12'
15,'A' ,'x1','2015-12-12'
16,'B' ,'x2','2015-12-12'
17,'B' ,'x2','2015-12-12'
18,'D' ,'x3','2015-12-12'
I want to count the number of distinct new key
-s for each date
:
2015-11-11 2 -- (two distinct keys: x1 and x2)
2015-11-12 1 -- (one new key: x3)
2015-12-11 2 -- (two distinct keys: x1 and x2) - (different month 11)
2015-12-12 1 -- (one new key: x3) - (different month 11)
Only distinct in each month.
How can I do this?
Upvotes: 0
Views: 1296
Reputation: 172944
it is exactly as in you previous question(s) - you just need to add extra group/partition by month --> see use of YearMonth field
SELECT DATE, EXACT_COUNT_DISTINCT(key) AS keys
FROM (
SELECT DATE, key, LEAD(DATE) OVER(PARTITION BY key, YearMonth ORDER BY DATE DESC) AS new
FROM (SELECT DATE, LEFT(DATE, 7) AS YearMonth, key FROM YourTable GROUP BY 1, 2, 3)
) WHERE new IS NULL
GROUP BY DATE
ORDER BY DATE
Upvotes: 1
Reputation: 13994
Here is how I tried to solve it:
Extract month from date:
select key, date, regexp_extract(date, r'[\d]+-(\d\d)-\d\d') month from t
Partition by month since we want to start counts from beginning every month, and compute distinct keys from beginning of the month
select date, month, count(distinct key) over (partition by month order by date rows between unbounded preceding and current row) cd from (select key, date, regexp_extract(date, r'[\d]+-(\d\d)-\d\d') month from t)
Pick the total count distincts per month
select date, month, max(cd) cd from ( select date, month, count(distinct key) over (partition by month order by date rows between unbounded preceding and current row) cd from (select key, date, regexp_extract(date, r'[\d]+-(\d\d)-\d\d') month from t)) group by 1, 2
For every date compute number of total unique keys from the beginning of month for previous date:
select date, cd, lag(cd, 1) over (partition by month order by date) prev_cd from ( select date, month, max(cd) cd from ( select date, month, count(distinct key) over (partition by month order by date rows between unbounded preceding and current row) cd from (select key, date, regexp_extract(date, r'[\d]+-(\d\d)-\d\d') month from t)) group by 1, 2)
Subtract previous date from current one - that's the answer:
select date, cd - prev_cd from ( select date, cd, lag(cd, 1) over (partition by month order by date) prev_cd from ( select date, month, max(cd) cd from ( select date, month, count(distinct key) over (partition by month order by date rows between unbounded preceding and current row) cd from (select key, date, regexp_extract(date, r'[\d]+-(\d\d)-\d\d') month from moshap.wd)) group by 1, 2))
Upvotes: 0