Reputation: 399
I have a dataset with names of people who enrolled for the gym with the dates they actually went to the gym. I want to count the number of days they consecutively entered the gym. The dataset looks like below.
Name | Date |
---|---|
Ally | 1/1/2021 |
Ally | 1/2/2021 |
Ally | 1/3/2021 |
Ally | 1/5/2021 |
Ally | 1/7/2021 |
Brian | 1/12/2021 |
Brian | 1/13/2021 |
Brian | 1/16/2021 |
Brian | 1/21/2021 |
Brian | 1/22/2021 |
Here we know Ally went to the gym 3 days consecutively (1/1/2021-1/3/2021) and Brian went 4 days (1/12/2021-1/13/2021 and 1/21/2021-1/22/2021). I want the final result to look like this.
Name | Count |
---|---|
Ally | 3 |
Ally | 1 |
Ally | 1 |
Brian | 2 |
Brian | 1 |
Brian | 2 |
Is there a way to do this in SQL?
Upvotes: 0
Views: 2537
Reputation: 1270993
You can subtract a incrementing value from the dates to get a constant when the dates are incrementing by one. The rest is just aggregation.
Because you have not clearly specified a database, I'll use standard SQL syntax:
select name, count(*), min(date), max(date)
from (select t.*,
row_number() over (partition by name order by date) as seqnum
from t
) t
group by name, date - seqnum * interval '1 day'
order by name, min(date);
This should work in Postgres (which is close to Standard SQL). In SQL Server, it would be:
group by name, dateadd(day, -seqnum, date)
Upvotes: 4