Reputation: 465
I have the below table in bigquery:
Timestamp variant_id activity
2020-04-02 08:50 1 active
2020-04-03 07:39 1 not_active
2020-04-04 07:40 1 active
2020-04-05 10:22 2 active
2020-04-07 07:59 2 not_active
I want to query this subset of data to get the number of active variant per day.
If variant_id 1 is active at date 2020-04-04, it still active the follwing dates also 2020-04-05, 2020-04-06 until the value activity column is not_active , the goal is to count each day the number of variant_id who has the value active in the column activity, but I should take into account that each variant_id has the value of the last activity on a specific date.
for example the result of the desired query in the subset data must be:
Date activity_count
2020-04-02 1
2020-04-03 0
2020-04-04 1
2020-04-05 2
2020-04-06 2
2020-04-07 1
2020-04-08 1
2020-04-09 1
2020-04-10 1
any help please ?
Upvotes: 0
Views: 176
Reputation: 173028
Consider below approach
select date, count(distinct if(activity = 'active', variant_id, null)) activity_count
from (
select date(timestamp) date, variant_id, activity,
lead(date(timestamp)) over(partition by variant_id order by timestamp) next_date
from your_table
), unnest(generate_date_array(date, ifnull(next_date - 1, '2020-04-10'))) date
group by date
if applied to sample data in your question - output is
Upvotes: 2