S31
S31

Reputation: 934

Counting ID's for correct creation date time

I need to get the number of user ID's for each month, but they should only be counted for the month if the user's minimum month falls within that month.

So if customer A had a min(day) of 04/18 then for month and year, they would be counted.

My table looks like:

monthyear | id  
02/18       A32
04/19       T39
05/19       T39
04/19       Y95
01/18       A32
12/19       I99
11/18       OPT
09/19       TT8

I was doing something like:

SELECT day, id 
SUM(CASE WHEN month = min(day) THEN 1 ELSE 0)
FROM testtable
GROUP BY 1

But I'm not sure how to specify that for each user ID, so only user ID = 1, when their min(Day) = day

Goal table to be:

monthyear | count  
01/18       1
02/18       0 
11/18       1
04/19       2
05/19       0
09/19       1
12/19       1

Upvotes: 0

Views: 34

Answers (3)

Gordon Linoff
Gordon Linoff

Reputation: 1269673

Use window functions. Let me assume that your monthyear is really yearmonth, so it sorts correctly:

SELECT yearmonth, COUNT(*) as numstarts
FROM (SELECT tt.*, ROW_NUMBER() OVER (PARTITION BY id ORDER BY yearmonth) as seqnum
      FROM testtable tt
     ) tt
WHERE seqnum = 1
GROUP BY yearmonth;

If you do have the absurd format of month-year, then you can use string manipulations. These depend on the database, but something like this:

SELECT yearmonth, COUNT(*) as numstarts
FROM (SELECT tt.*,
             ROW_NUMBER() OVER (PARTITION BY id ORDER BY RIGHT(monthyear, 2), LEFT(monthyear, 2) as seqnum
      FROM testtable tt
     ) tt
WHERE seqnum = 1
GROUP BY yearmonth;

Upvotes: 1

Tad Harrison
Tad Harrison

Reputation: 1293

You are looking for the number of new users each month, yes?

Here is one way to do it. Note that I had to use TO_DATE and TO_CHAR to make sure the month/year text strings sorted correctly. If you use real DATE columns that would be unnecessary.

An additional complexity was adding the empty months in (months with zero new users). Optimally that would not be done by using a SELECT DISTINCT on the base table to get all months.

create table x (
monthyear varchar2(20),
id varchar2(10)
);

insert into x values('02/18', 'A32');
insert into x values('04/19', 'T39');
insert into x values('05/19', 'T39');
insert into x values('04/19', 'Y95');
insert into x values('01/18', 'A32');
insert into x values('12/19', 'I99');
insert into x values('11/18', 'OPT');
insert into x values('09/19', 'TT8');

And the query:

with allmonths as(
  select distinct monthyear from x
),
firstmonths as(
  select id, to_char(min(to_date(monthyear, 'MM/YY')),'MM/YY') monthyear from x group by id
),
firstmonthcounts as(
select monthyear, count(*) cnt
  from firstmonths group by monthyear
)
select am.monthyear, nvl(fmc.cnt, 0) as newusers
from allmonths am left join firstmonthcounts fmc on am.monthyear = fmc.monthyear
order by to_date(monthyear, 'MM/YY');

Upvotes: 0

M. Kanarkowski
M. Kanarkowski

Reputation: 2195

I assumed that you have a column that's a date (use of min() is necessary). You can do it by selecting a minimal date(subquery t2) for each id and then count only these rows that connect throught left join, so if there is no connection you will get zeros for these dates or monthyear as you have in your data.

select
     monthyear
    ,count(t2.id) as cnt
from testtable t1
left join (
select
    min(date) as date
    ,id
from testtable
group by id
) t2
 on t2.date = t1.date
 and t2.id = t1.id
 group by monthyear

Upvotes: 0

Related Questions