Reputation: 934
I need to get the number of user ID's for each month, but they should only be counted for the month if the user's minimum month falls within that month.
So if customer A had a min(day) of 04/18 then for month and year, they would be counted.
My table looks like:
monthyear | id
02/18 A32
04/19 T39
05/19 T39
04/19 Y95
01/18 A32
12/19 I99
11/18 OPT
09/19 TT8
I was doing something like:
SELECT day, id
SUM(CASE WHEN month = min(day) THEN 1 ELSE 0)
FROM testtable
GROUP BY 1
But I'm not sure how to specify that for each user ID, so only user ID = 1, when their min(Day) = day
Goal table to be:
monthyear | count
01/18 1
02/18 0
11/18 1
04/19 2
05/19 0
09/19 1
12/19 1
Upvotes: 0
Views: 34
Reputation: 1269673
Use window functions. Let me assume that your monthyear
is really yearmonth
, so it sorts correctly:
SELECT yearmonth, COUNT(*) as numstarts
FROM (SELECT tt.*, ROW_NUMBER() OVER (PARTITION BY id ORDER BY yearmonth) as seqnum
FROM testtable tt
) tt
WHERE seqnum = 1
GROUP BY yearmonth;
If you do have the absurd format of month-year, then you can use string manipulations. These depend on the database, but something like this:
SELECT yearmonth, COUNT(*) as numstarts
FROM (SELECT tt.*,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY RIGHT(monthyear, 2), LEFT(monthyear, 2) as seqnum
FROM testtable tt
) tt
WHERE seqnum = 1
GROUP BY yearmonth;
Upvotes: 1
Reputation: 1293
You are looking for the number of new users each month, yes?
Here is one way to do it. Note that I had to use TO_DATE and TO_CHAR to make sure the month/year text strings sorted correctly. If you use real DATE columns that would be unnecessary.
An additional complexity was adding the empty months in (months with zero new users). Optimally that would not be done by using a SELECT DISTINCT on the base table to get all months.
create table x (
monthyear varchar2(20),
id varchar2(10)
);
insert into x values('02/18', 'A32');
insert into x values('04/19', 'T39');
insert into x values('05/19', 'T39');
insert into x values('04/19', 'Y95');
insert into x values('01/18', 'A32');
insert into x values('12/19', 'I99');
insert into x values('11/18', 'OPT');
insert into x values('09/19', 'TT8');
And the query:
with allmonths as(
select distinct monthyear from x
),
firstmonths as(
select id, to_char(min(to_date(monthyear, 'MM/YY')),'MM/YY') monthyear from x group by id
),
firstmonthcounts as(
select monthyear, count(*) cnt
from firstmonths group by monthyear
)
select am.monthyear, nvl(fmc.cnt, 0) as newusers
from allmonths am left join firstmonthcounts fmc on am.monthyear = fmc.monthyear
order by to_date(monthyear, 'MM/YY');
Upvotes: 0
Reputation: 2195
I assumed that you have a column that's a date (use of min() is necessary). You can do it by selecting a minimal date
(subquery t2
) for each id and then count only these rows that connect throught left join
, so if there is no connection you will get zeros for these dates or monthyear
as you have in your data.
select
monthyear
,count(t2.id) as cnt
from testtable t1
left join (
select
min(date) as date
,id
from testtable
group by id
) t2
on t2.date = t1.date
and t2.id = t1.id
group by monthyear
Upvotes: 0